ABSTRACT


Title  of  Dissertation:  MODELING  AND  REDUCTION  WITH 

APPLICATIONS  TO  SEMICONDUCTOR 
PROCESSING 


Andrew  J.  Newman,  Doctor  of  Philosophy,  1999 


Dissertation  directed  by:  Professor  P.  S.  Krishnaprasad 

Department  of  Electrical  and  Computer  Engineering 


This thesis consists of several somewhat distinct but connected parts, with
an  underlying  motivation  in  problems  pertaining  to  control  and  optimization  of 
semiconductor  processing.  The  first  part  (Chapters  3  and  4)  addresses  problems 
in  model  reduction  for  nonlinear  state-space  control  systems.  In  1993,  Scherpen 
generalized  the  balanced  truncation  method  to  the  nonlinear  setting.  However, 
the  Scherpen  procedure  is  not  easily  computable  and  has  not  yet  been  applied 
in  practice.  We  offer  a  method  for  computing  a  working  approximation  to  the 
controllability  energy  function,  one  of  the  main  objects  involved  in  the  method. 
Moreover, we show that for a class of second-order mechanical systems with
dissipation, under certain conditions related to the dissipation, an exact formula for
the controllability function can be derived. We then present an algorithm for a
numerical implementation of the Morse-Palais lemma, which produces a local
coordinate transformation under which a real-valued function with a non-degenerate
critical  point  is  quadratic  on  a  neighborhood  of  the  critical  point.  Application 
of the algorithm to the controllability function plays a key role in computing the
balanced  representation.  We  then  apply  our  methods  and  algorithms  to  derive 
balanced  realizations  for  nonlinear  state-space  models  of  two  example  mechanical 
systems:  a  simple  pendulum  and  a  double  pendulum. 

The  second  part  (Chapter  5)  deals  with  modeling  of  rapid  thermal  chemical 
vapor  deposition  (RTCVD)  for  growth  of  silicon  thin  films,  via  first-principles  and 
empirical  analysis.  We  develop  detailed  process-equipment  models  and  study  the 
factors  that  influence  deposition  uniformity,  such  as  temperature,  pressure,  and 
precursor  gas  flow  rates,  through  analysis  of  experimental  and  simulation  results. 
We demonstrate that temperature uniformity does not guarantee deposition thickness
uniformity in a particular commercial RTCVD reactor of interest. In the
third  part  (Chapter  6)  we  continue  the  modeling  effort,  specializing  to  a  control 
system  for  RTCVD  heat  transfer.  We  then  develop  and  apply  ad-hoc  versions  of 
prominent  model  reduction  approaches  to  derive  reduced  models  and  perform  a 
comparative  study. 


MODELING  AND  REDUCTION  WITH  APPLICATIONS 


TO  SEMICONDUCTOR  PROCESSING 


by 

Andrew  J.  Newman 


Dissertation  submitted  to  the  Faculty  of  the  Graduate  School  of  the 
University  of  Maryland,  College  Park  in  partial  fulfillment 
of  the  requirements  for  the  degree  of 
Doctor  of  Philosophy 
1999 


Advisory  Committee: 

Professor  P.  S.  Krishnaprasad,  Chairman 
Professor  Carlos  A.  Berenstein 
Professor  William  S.  Levine 
Professor  Steven  I.  Marcus 
Professor  Shihab  A.  Shamma 


©Copyright  by 
Andrew  J.  Newman 


1999 


PREFACE 


“What  have  you  done  for  science?” 


— P.  S.  Krishnaprasad 


This  question  has  been  posed  to  me  repeatedly  over  the  past  five  years.  With 
this  thesis,  I  finally  offer  an  answer. 

The contents herein take some small steps toward making nonlinear balancing,
a new model reduction method introduced by other researchers, accessible for
use in practical applications. I hope that this work is followed by improvements,
and leads to additional interesting and useful results, and, ultimately, to practical
implementation. I intend to pursue this goal and would be pleased if other
researchers  find  fruitful  points  of  departure.  I  also  hope  that  readers  who  have 
worked  with  standard  versions  of  the  methods  described  in  this  thesis  will  gain  a 
better  understanding  of  when  and  how  to  use  them. 

It  has  been  my  good  fortune,  through  the  skillful  and  tireless  efforts  of  my 
advisor,  to  have  my  research  funded  by  various  grants  and  a  joint  project  with 
an  industrial  partner,  Northrop  Grumman  Corporation  (Electronic  Sensors  and 
Systems Sector) of Linthicum, Maryland. This project has afforded me the invaluable
experience of performing research toward solving “real-world” manufacturing
problems from which I have benefited greatly. It is worthwhile to mention that
in such a situation the objectives of the parties may not perfectly coincide, e.g.,
enhanced fundamental understanding of physical-chemical processes versus manufacturing
support leading to immediately tangible cost reductions. I can only hope
that  I  have  balanced  the  competing  pressures  in  a  way  that  is  somewhat  satisfying 
to  all  of  the  involved  parties. 

During  the  more  than  seven  years  I  spent  here  in  College  Park,  the  campus  and 
the city have changed and improved significantly in many ways. I consider myself
lucky to have worked here during the presidency of Dr. William E. Kirwan. Moreover,
the ISR, under the leadership of Dr. Steven Marcus and Dr. Gary Rubloff,
and  the  ECE  Department,  under  the  leadership  of  Dr.  Nariman  Farvardin,  have 
been  wonderful  and  stimulating  places  to  work  over  this  period.  I  have  noticed  the 
results  of  their  efforts  on  a  daily  basis.  Moreover,  it  has  been  a  pleasure  to  work 
with Dr. David Bader and Dr. Raadhakrishnan Poovendran toward the establishment
of a thriving ECE graduate student association. Finally, it is a good thing
that College Park has an establishment dedicated to “smoothies,” as they kept my
energy  level  high  without  resorting  to  coffee  (the  beverage  of  choice,  it  seems,  for 
graduate  students). 

My  advisor  gave  me  all  sorts  of  advice  and  guidance  over  the  past  several  years 
on  a  variety  of  subjects.  Yet  there  was  a  common  theme  that  is  characterized  in  a 
statement  that  he  repeated  often.  It  is  one  that  I  will  remember  and  try  to  apply 
in  the  future. 

“Be  relentless  in  the  pursuit  of  knowledge.” 

— P.  S.  Krishnaprasad 

Andrew  Joseph  Newman 
College  Park,  Maryland 
December, 1999

Chapter 1


Introduction 


This thesis consists of several somewhat distinct but connected parts, with an underlying
motivation in problems pertaining to control and optimization of semiconductor
processing. The first part (Chapters 3 and 4) addresses problems in model
reduction for nonlinear state-space control systems. The problems are motivated
by a thorough discussion and analysis of prominent state-of-the-art approaches.
We then offer solutions via methods, tools, and algorithms for computation of balanced
realizations, both in general and motivated by specific applications. The
second part (Chapter 5) deals with modeling of rapid thermal chemical vapor deposition
(RTCVD) for growth of silicon thin films, via first-principles and empirical
analysis.  We  present  detailed  process-equipment  models  and  study  the  factors  that 
influence  deposition  uniformity  through  analysis  of  experimental  and  simulation 
results.  In  the  third  part  (Chapter  6)  we  continue  the  modeling  effort,  specializing 
to  a  control  system  for  RTCVD  heat  transfer.  We  then  develop  and  apply  ad-hoc 
versions  of  prominent  model  reduction  approaches  to  derive  reduced  models  and 
perform  a  comparative  study. 

In this introductory chapter we provide background material, an overview of the
scope and contributions of this thesis, and a guide to its organization by chapters.


1.1  Background 

The modeling of complex dynamical systems is one of the most important subjects
in science and engineering. For control engineers, the subject is crucial, since
control law design requires the formulation of a suitable mathematical model for
the system of interest. For many systems, the underlying physics is known and
physics-based models exist whose predictive capability has been demonstrated
experimentally and is well established. For example, the Navier-Stokes equations
(see, e.g., [61]) together with appropriate boundary conditions and initial conditions
provide a reliable mathematical description for the flow of a Newtonian fluid.

It  is  often  the  case  that  a  model  is  too  complicated  to  be  useful  for  its  intended 
application.  Highly  complex  models  cause  difficulties  in  controller  synthesis  and 
may  place  excessive  computational  burdens  on  software  and  hardware  used  for 
simulation  and  control.  For  example,  the  difficulties  involved  with  the  design  of 
control  algorithms  using  a  nonlinear  partial  differential  equation  (PDE)  model 
such  as  Navier-Stokes  are  well  known  (e.g.,  [70,  99]).  One  remedy  is  to  make 
approximations  to  the  model,  based  on  physical  considerations  and  mathematical 
analysis,  in  order  to  derive  a  simpler  model  from  the  original  complicated  one. 
This  is  what  is  generally  referred  to  as  model  reduction. 

The  distinction  between  modeling  and  model  reduction  is  blurry  and  varies 
among  different  authors.  Verriest  [160]  defines  modeling  as  the  process  whereby 
an  abstract  mathematical  model  is  matched  to  the  physical  reality,  and  model 
reduction  the  process  whereby  a  simpler  mathematical  model  is  derived  from  an 
existing mathematical model. This notion is intuitive but we mention the following
exceptions. We interpret simplifications based solely on physical considerations,
such  as  eliminating  terms  describing  conductive  effects  in  a  heat  transfer  problem 
where  radiative  effects  are  dominant,  as  falling  within  the  realm  of  modeling,  and 
take  the  resulting  simpler  (but  still  complicated)  model  as  the  new  starting  point 
for  model  reduction.  Also,  there  are  situations  in  control  engineering  and  physics  in 
which the model contains redundancies that can be eliminated through mathematical
analysis, yielding simplified models which make exactly the same, rather than
approximate, predictions as the original. Examples include non-minimal linear
state-space models (see e.g., [72]) and certain conservative systems with symmetries
(e.g., [100]). Again, we take the model with redundancies already eliminated
as  the  starting  point  for  reduction. 

1.1.1  Model  Reduction 

Two  basic  attributes  of  a  mathematical  model  are  its  fidelity  and  complexity. 
Generally  speaking,  the  fidelity  of  a  model,  also  called  correctness,  refers  to  its 
capability  to  predict  the  behavior  of  the  system  being  modeled.  It  can  also  be 
thought  of  as  the  degree  to  which  characteristics  of  the  physical  system  are  reflected 
by  the  model.  The  complexity  of  a  model  is  given,  roughly,  by  the  number  of 
unknowns  that  must  be  determined  in  order  to  characterize  the  system  behavior. 
There  is  a  natural  trade-off  between  these  two  attributes.  Approximations  resulting 
in  a  complexity  reduction  necessarily  degrade  fidelity  and  vice-versa  (otherwise,  as 
stated  earlier,  the  original  model  is  an  unacceptable  starting  point). 

The advantages of low complexity are clear. It allows for an easier understanding
of model dynamics and simplification of controller synthesis. The reduced computational
burden of low complexity models leads to faster and easier computer
simulation, faster control algorithms, and more reliable controller implementations
(whether in hardware or software) since there are fewer sources of potential failure.
Again referring to the control of fluid flows, initial successes have recently
been shown toward using low complexity models, derived from Navier-Stokes, in
the  development  of  control  algorithms  for  the  wall  region  of  a  turbulent  boundary 
layer  [99]. 

The  model  reduction  problem,  then,  is  one  of  finding  a  systematic  methodology 
within  a  given  mathematical  framework  to  produce  an  efficient  or  optimal  trade-off 
of fidelity versus complexity. By efficient and optimal, respectively, we mean relatively
small and the smallest possible degradation in fidelity for a given complexity
reduction. Thus, the procedure should quantify the effect of a given approximation
on fidelity in some meaningful way. Guaranteed error bounds are desirable.
It  is  also  useful  for  the  procedure  to  guarantee  the  preservation  of  properties  such 
as  open-loop  and  closed-loop  stability.  It  is  then  up  to  the  designer  to  choose 
the  degree  of  reduction  based  on  considerations  for  the  particular  application  of 
interest. 

This  thesis  deals  with  model  reduction  within  the  framework  of  continuous-time 
state-space  control  systems.  By  control  system  we  mean  a  dynamical  system  with 
exogenous  inputs  (e.g.,  controls,  disturbances)  and  outputs  (e.g.,  measurements, 
variables  of  interest).  The  dimension  of  a  state-space  model,  also  known  as  the 
model  order,  is  the  number,  possibly  infinite,  of  independent  variables  needed  to 
characterize  the  “state”  of  the  system,  which,  roughly  speaking,  represents  the 
memory  that  the  system  has  of  its  past.  These  variables  are  called  state  variables, 
or  state  components,  and  an  ordered  collection  of  all  of  them  is  called  the  system 
state. The set of allowable values for the state is called the state-space, also
known as the phase-space. Definitions and a mathematical set-up are presented in
Section  2.1. 

If there are a finite number of state variables, then we call the system finite-dimensional.
Otherwise, by convention the system order is set to infinity and
we  call  the  system  infinite-dimensional,  also  known  as  a  distributed  parameter 
system.  In  the  state-space  context,  complexity  is  equivalent  to  model  order.  Thus, 
it  is  clear  that  model  reduction  is  essential  in  the  infinite-dimensional  setting.  The 
original  (physics-based)  model  is  called  the  full-order  model,  while  approximations 
are  called  reduced-order  models. 

One  measure  of  fidelity,  i.e.,  the  quality  of  approximation,  is  given  by 

$$ \sup_{u \in U} \frac{\| y - y_r \|}{\| u \|} \qquad (1.1) $$

where y represents the full-order system output, y_r represents the reduced-order
system output, u represents the input belonging to the admissible class U, and
‖·‖ denotes an appropriate norm. This amounts to measuring the worst-case error
between  the  outputs  of  the  original  and  reduced  models  over  all  admissible  input 
signals.  Other  measures  of  fidelity  can  also  be  used  depending  upon  the  situation. 

The  general  methodology  for  state-space  model  reduction  involves  coordinate 
transformation  followed  by  component  truncation.  The  procedure  is  illustrated  in 
Figure  1.1.  The  state  can  be  expressed  in  terms  of  coordinates,  i.e.,  as  a  linear 
combination of basis elements for the state-space. For reduction we find a coordinate
system in which each state component is ranked according to its contribution,
or importance, to the relevant (e.g., input-to-output) system behavior. Then, the
system evolution equation is expressed in terms of the new coordinates, and state
components with relatively little importance are deleted from the model. Integration
of the reduced evolution equation gives the trajectory of the reduced-order
state. Finally, an approximation to the original full-order state is reconstructed
from the reduced-order state. The choice of coordinate transformation is what
generally distinguishes different methods and is the key to achieving efficiency.

Figure 1.1: General state-space model reduction methodology.
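
As a hedged illustration of the transform-and-truncate procedure described above, the following Python sketch applies it to a finite-dimensional linear model dx/dt = Ax + Bu, y = Cx. The linear setting, the function names, and the requirement that the invertible matrix T already orders the new coordinates by importance are assumptions made for illustration only; they are not constructions taken from this thesis.

```python
import numpy as np

def transform_and_truncate(A, B, C, T, r):
    """Rewrite (A, B, C) in coordinates z = T^{-1} x and keep the first r components.

    T is an assumed invertible change of coordinates whose columns are
    ordered from most to least important.
    """
    T_inv = np.linalg.inv(T)
    A_z = T_inv @ A @ T   # evolution equation in the new coordinates
    B_z = T_inv @ B
    C_z = C @ T
    # Truncation: delete the state components ranked below r.
    return A_z[:r, :r], B_z[:r, :], C_z[:, :r]

def reconstruct_state(T, z_r):
    """Approximate the full-order state; deleted components are set to zero."""
    r = z_r.shape[0]
    return T[:, :r] @ z_r
```

The different methods discussed in this chapter would enter such a sketch only through the choice of T.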

In  earlier  practice,  model  approximation  has  been  largely  based  on  heuristics 
and  ad-hoc  trial-and-error  methods.  However,  over  the  past  few  decades,  model 
reduction  on  a  solid  mathematical  basis  has  been  the  subject  of  extensive  research 
from  a  broad  range  of  viewpoints  and  over  a  large  number  of  application  areas. 
This  research  has  resulted  in  a  variety  of  reduction  tools. 

One  tool  that  has  found  much  recent  application  in  simplification  of  models 
for fluid flow, especially in the area of turbulence, is the proper orthogonal decomposition
(POD) (see, e.g., [15, 67, 96, 147]), also known as the Karhunen-Loève
expansion  (see,  e.g.,  [126,  166])  of  a  second-order  stochastic  process.  The  POD 
can  be  described,  roughly,  as  a  procedure  for  extracting  a  basis  for  an  orthogonal 
decomposition  of  the  state-space  from  an  ensemble  of  signals.  In  the  context  of 
model  reduction  for  dynamical  systems,  the  ensemble  must  capture,  or  represent, 
the  relevant  system  behavior.  The  procedure  is  attractive  for  several  reasons.  It 
is  applicable  to  linear  and  nonlinear  models,  and  to  models  of  finite  and  infinite 
dimension. It provides a meaningful ranking of state components from an energy
contribution viewpoint, and enjoys properties such as optimality (see e.g., [15]) in
the  sense  of  data  compression  and  error  minimization.  The  POD  is  a  tool  of  major 
importance  to  this  thesis  and  is  described  in  detail  in  Section  3.2. 
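
For a finite-dimensional state sampled at finitely many instants, one common way to compute a POD basis is through the singular value decomposition of a snapshot matrix. The sketch below assumes that setting; the function name `pod_basis` and the layout of `snapshots` (one state sample per column) are illustrative choices, and the formulation used in this thesis is the one given in Section 3.2.

```python
import numpy as np

def pod_basis(snapshots, r):
    """Return the r leading POD modes of an ensemble of state samples.

    snapshots: (n, m) array whose m columns are samples of the n-dimensional state.
    Returns an (n, r) orthonormal basis and the fraction of ensemble energy
    captured by each retained mode.
    """
    # Left singular vectors give the orthogonal basis; squared singular
    # values rank the modes by their energy contribution.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = s**2 / np.sum(s**2)
    return U[:, :r], energy[:r]
```

Projecting the ensemble onto the returned basis, `Phi @ (Phi.T @ snapshots)`, gives the best rank-r approximation of the data in the Frobenius norm, which is the data-compression sense of optimality mentioned above.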

The  POD  has  been  independently  rediscovered  and  analyzed  from  different 
points of view several times since the 1940s (see [15] for a brief history). It has
been  applied  in  a  variety  of  areas  including  image  processing  (e.g.,  [146]),  modeling 
and  control  of  chemical  processes  (e.g.,  [25,  52]),  and  turbulence  modeling  (e.g.,  [11, 
12,  149]).  While  the  properties  of  the  POD  are  well  known  and  useful  from  the 
point  of  view  of  reducing  the  dimension  of  a  single  ordinary  or  partial  differential 
equation  model,  the  control  viewpoint  introduces  new  issues.  By  letting  controls 
take  values  in  a  suitable  function  space,  a  family  of  ordinary  or  partial  differential 
equations  is  obtained.  From  the  empirical  perspective,  it  becomes  unclear  how 
to  generate  a  representative  data  ensemble,  since  the  system  response  depends 
strongly  on  the  chosen  input  signal.  The  ranking  of  states,  and  properties  such 
as  optimality,  lose  their  precise  meaning.  Furthermore,  the  relationship  between 
states  and  outputs  is  ignored  in  determining  the  POD  basis. 

Nevertheless, during the 1990s, the POD has been prominent as a tool for
model reduction of state-space control systems, particularly in the area of temperature
control for rapid thermal processing (RTP) (e.g., [1, 5, 6, 13, 120, 157]), a
process  used  for  several  functions  involved  in  manufacturing  semiconductor  devices 
(see  Section  1.1.2).  In  control  system  applications  of  POD,  mathematical  rigor  is 
replaced  by  ad-hoc  procedures.  For  example,  in  [13]  the  authors  generate  several 
reduced  models  for  an  RTP  system,  each  corresponding  to  a  different  operating 
point,  and  switch  among  them  via  heuristic  rules  according  to  the  state  trajectory. 
Thus, the method can be effective as part of an overall ad-hoc procedure, but is not
necessarily satisfying from a control-theoretic point of view.

The  basis  elements  generated  by  the  POD  procedure  are  often  referred  to  as 
principal  components  (see  Section  2.3)  or  empirically  determined  eigenfunctions, 
because  they  are  extracted  from  an  empirically  generated  ensemble  of  signals. 
There  are  other  model  reduction  procedures  that  use  a  fixed,  rather  than  empirical, 
basis  for  an  orthogonal  decomposition  of  the  state-space.  For  example,  wavelet 
bases  (see,  e.g.,  [35])  have  recently  been  used  in  model  approximation  for  the 
control  of  heat  diffusion  and  vibration  damping  in  a  visco-thermoelastic  rod  [20]. 
However,  these  methods  do  not  take  advantage  of  existing  physical  or  empirical 
knowledge  of  the  system  in  choosing  the  basis.  Consequently,  there  is  little  that 
can  be  said  about  their  effectiveness  in  general  situations. 

Considerable  work  has  been  devoted  to  model  reduction  for  finite-dimensional 
linear  time-invariant  (LTI)  control  systems,  dating  back  to  the  1960s  (see  [53] 
for  a  complete  list  of  references  through  1976).  These  efforts  generally  fall  into 
the  categories  of  polynomial  approximations  in  the  frequency-domain,  state-space 
transformation  and  component  truncation  in  the  time-domain,  and  parametric 
optimization techniques (see [48] for a complete overview). Some of these are motivated
by and designed for a particular application. For example, modal analysis
(see,  e.g.,  [103])  is  mainly  used  as  a  tool  for  reducing  the  complexity  of  linear  lightly 
damped  mechanical  systems  (e.g.,  [26]). 

An LTI state-space method of general importance and applicability is balancing,
introduced by Moore [109] in 1981 for stable, minimal, finite-dimensional LTI
systems. In this method, a system is transformed to balanced form, which means
that it is “equally controllable and observable.” The states of a balanced realization
can be ranked according to their influence on the input-to-output behavior of
the system, as measured by its input-to-output gain, or Hankel norm. For LTI systems,
balancing is strongly related to the POD, in the sense that the basis for the
balancing  coordinate  transformation  can  be  derived  using  principal  components 
generated  via  injection  of  impulsive  inputs.  We  elaborate  on  balancing  for  LTI 
systems  in  Section  3.3. 
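
To make the idea of balancing concrete for the LTI case, the following sketch computes a balanced realization with the standard square-root construction and then truncates it. It assumes a stable, minimal system (A, B, C); the function name and the use of SciPy's Lyapunov solver are illustrative choices and are not taken from the algorithms cited in this chapter.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Balance a stable, minimal LTI system and keep its r leading states."""
    # Gramians solve A Wc + Wc A^T + B B^T = 0 and A^T Wo + Wo A + C^T C = 0.
    Wc = solve_continuous_lyapunov(A, -B @ B.T)
    Wo = solve_continuous_lyapunov(A.T, -C.T @ C)
    Lc = cholesky(Wc, lower=True)   # Wc = Lc Lc^T
    Lo = cholesky(Wo, lower=True)   # Wo = Lo Lo^T
    U, s, Vt = svd(Lo.T @ Lc)       # s holds the Hankel singular values
    S_inv_sqrt = np.diag(s ** -0.5)
    T = Lc @ Vt.T @ S_inv_sqrt      # balancing transformation, x = T z
    T_inv = S_inv_sqrt @ U.T @ Lo.T
    A_b, B_b, C_b = T_inv @ A @ T, T_inv @ B, C @ T
    # In balanced coordinates both Gramians equal diag(s); truncate to r states.
    return A_b[:r, :r], B_b[:r, :], C_b[:, :r], s
```

States associated with small Hankel singular values contribute little to the input-to-output behavior, so r is typically chosen by inspecting the decay of `s`.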

During the 1980s and 1990s, various versions of balancing and other Hankel-norm
based methods (e.g., [34, 54, 113]) were developed for finite- and infinite-dimensional
LTI systems. Concurrently, there was a substantial effort toward
development  of  algorithms  and  computational  tools  (e.g.,  [91,  136])  for  practical 
implementation  of  linear  balancing,  resulting  in  its  wide  application  to  produce 
low-order  models  for  LTI  control  systems.  The  main  objects  of  importance  in 
linear  balancing  are  the  controllability  and  observability  Gramian  matrices.  Thus, 
the  computational  tools  are  based  mainly  on  well  known  and  efficient  algorithms 
for  matrix  algebra  problems. 

Scherpen  [140,  141]  extended  the  balancing  approach,  introducing  in  1993  a 
general  theory  and  procedure  of  balancing  for  a  class  of  stable,  affine,  smooth, 
finite-dimensional nonlinear systems. The main objects of importance in nonlinear
balancing are the controllability and observability energy functions. The
balancing  coordinate  transformation  is  local  to  a  neighborhood  of  the  origin  and 
determined  by  application  of  the  Morse-Palais  lemma,  which  gives  a  canonical  form 
for  functions  in  the  neighborhood  of  a  non-degenerate  critical  point.  Scherpen’s 
nonlinear  balancing  procedure  forms  a  major  part  of  the  foundation  for  this  thesis, 
and  is  detailed  throughout  Chapter  4. 

In contrast to the linear case, the nonlinear balancing procedure is not immediately
amenable to computational implementation. For example, the controllability
energy function corresponds to the value function for a nonlinear optimal control
problem. Also, the Morse-Palais lemma guarantees the existence of a transformation
to a canonical form for the controllability energy function, but provides
no  constructive  procedure  for  obtaining  it.  Thus,  tools  have  not  yet  appeared  for 
computing  balanced  realizations  for  nonlinear  systems,  and  the  procedure  has  not 
yet  been  applied  as  a  tool  for  model  reduction. 

In  this  thesis  we  offer  methods  for  computing  balanced  realizations  for  nonlinear 
systems.  We  rely  heavily  on  the  theory  of  stochastically  excited  dynamical  systems, 
i.e.,  control  systems  with  Gaussian  white  noise  injected  at  the  input  terminals.  The 
state  of  such  a  system  is  a  stochastic  process  with  an  associated  density  function. 
The  evolution  of  the  state  is  governed  by  a  stochastic  differential  equation,  and 
the  evolution  of  the  density  function  is  governed  by  a  pair  of  hypoelliptic  diffusion 
equations. In the case of a linear system, the covariance matrix of the steady-state
density is equal to the controllability Gramian matrix of the corresponding
deterministic  system,  a  fact  which  motivates  useful  generalizations  to  the  nonlinear 
setting.  Mathematical  preliminaries  for  dealing  with  stochastically  excited  systems 
are  presented  in  Section  2.6. 
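
The linear fact quoted above can be written out in a few lines. The following is a sketch under assumed conditions (A stable, (A, B) controllable, unit-intensity white noise w); the symbols \Sigma, W_c, p_\infty, and L_c are illustrative notation and need not match the notation of the later chapters.

```latex
% Assumed setting: dx = A x dt + B dw with A stable and (A, B) controllable.
\begin{align*}
dx &= A x\,dt + B\,dw, \\
0 &= A \Sigma + \Sigma A^{T} + B B^{T}
   \quad \text{(steady-state covariance } \Sigma \text{)}, \\
W_c &= \int_0^{\infty} e^{At} B B^{T} e^{A^{T} t}\,dt
   \quad \text{solves the same Lyapunov equation, hence } \Sigma = W_c, \\
p_\infty(x) &\propto \exp\!\left(-\tfrac{1}{2}\, x^{T} \Sigma^{-1} x\right),
   \qquad
L_c(x) = \tfrac{1}{2}\, x^{T} W_c^{-1} x = -\log p_\infty(x) + \text{const}.
\end{align*}
```

In this linear setting the controllability energy function L_c can therefore be read off from the logarithm of the steady-state density, which is the relationship that suggests a nonlinear generalization.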

The model reduction tools that we develop have general applicability but become
impractical for systems of sufficiently high dimension. However, for certain
specific  types  of  systems,  we  can  obtain  computable  results,  e.g.,  an  exact  formula 
for  the  controllability  function.  We  study  the  class  of  second-order  mechanical 
systems  characterized  by  a  Hamiltonian  (conservative)  structure  and  perturbed  by 
dissipation  and  forcing.  Under  certain  conditions,  steady-state  densities  can  be 
obtained  for  these  systems  from  which  the  controllability  energy  function  can  be 
derived.

1.1.2 Modeling: Rapid Thermal CVD for Silicon Growth


In  recent  years,  the  semiconductor  industry  has  begun  to  employ  mathematical  and 
computational models as tools to aid in equipment design (e.g., [29, 49, 155]), simulation
(e.g., [84, 98]), process optimization (e.g., [135]), and model-based process
control (e.g., [30, 139, 63]). These modeling efforts have been motivated primarily
by gains in manufacturing cost effectiveness. In particular, the design of new equipment
and manufacturing processes is typically performed via costly trial-and-error
procedures. Modeling and simulation are used to reduce the number of required
experiments, thus reducing the associated cost and development cycle time. Furthermore,
as device sizes shrink, and as wafer sizes grow, process control becomes
more  challenging  and  specifications  become  tighter.  In  many  cases,  model-based 
process  control  is  needed  to  accomplish  the  task. 

One  semiconductor  manufacturing  process  that  has  been  the  subject  of  much 
recent  modeling  activity  from  various  points  of  view  is  chemical  vapor  deposition 
(CVD) (see e.g., [75, 133, 144]), a common way of depositing thin layers of conducting
and insulating films over the surface of a semiconductor wafer. In CVD,
the  material  is  deposited  from  a  gaseous  precursor  to  the  substrate  via  chemical 
reactions  that  are  activated  by  heat  energy.  In  this  thesis,  we  are  concerned  with 
dynamic and steady-state models of CVD for growth of silicon thin films on a silicon
wafer. In particular, we focus on rapid thermal CVD, i.e., CVD processes that
employ  RTP  technology  for  wafer  heating.  The  process  is  illustrated  in  Figure  1.2. 

RTP  (see  e.g.,  [18,  132])  is  a  technology  for  rapidly  heating  and  cooling  a  single 
semiconductor wafer, allowing manufacturing processes to achieve high temperatures
for short (e.g., 5 seconds), well-controlled periods of time. The wafer is usually
heated by energy radiated from specially designed high-power lamps. A temporally
varying temperature profile is programmed to achieve the desired processing
step. RTP technology is versatile; its capabilities have been used in several wafer
processing functions including annealing, CVD, oxidation, nitridation, and contact
sintering. Several different designs exist for RTCVD equipment (e.g., [27, 80, 155]).

Figure 1.2: Simplified illustration of rapid thermal CVD.

Precise  control  of  the  temperature  distribution  across  the  wafer  surface  during 
RTP  is  critical  to  achieving  a  uniform  film  thickness  across  the  wafer  surface, 
ensuring  reproducibility  from  wafer  to  wafer,  and  minimizing  thermal  stress  on 
the  wafer  during  processing.  Even  small  temperature  variations  (non-uniformities) 
can  cause  large  thickness  variations,  resulting  in  costly  reductions  in  process  yield. 
Various  control  strategies  have  been  tried  recently,  including  Run-to-Run  (RtR) 
control  (e.g.,  [68,  86,  168])  and  model-based  feedback  control  (e.g.,  [63,  62,  39]). 

The  modeling  and  analysis  of  CVD  equipment  and  processes  presented  in  this 
thesis  are  mainly  the  result  of  a  joint  project  [116,  117,  115]  between  the  Institute 
for  Systems  Research  (ISR)  of  the  University  of  Maryland,  College  Park,  and  the 
Electronic Sensors and Systems Sector of Northrop Grumman Corporation (NG-ESSS),
Baltimore, MD, undertaken during 1997 and 1998. The overall objective
of the project was to improve manufacturing effectiveness for epitaxial growth
of  silicon  and  silicon-germanium  (Si-Ge)  thin  films  on  a  silicon  wafer.  Epitaxial 
growth,  or  epitaxy,  (see,  e.g.,  [134,  162])  refers  to  the  deposition  of  a  thin  layer 
of  material  onto  the  surface  of  a  single-crystal  substrate  in  such  a  manner  that 
the  layer  is  also  single-crystal  and  has  a  fixed  and  predetermined  crystallographic 
orientation  with  respect  to  the  substrate. 

The equipment used at NG-ESSS to deposit the thin films (and currently a
production tool in use for various processes) was the Epsilon-1 RTCVD reactor,
manufactured by ASM America, Phoenix, AZ. NG-ESSS uses the Epsilon-1 to deposit
both poly-crystalline and epitaxial (single-crystal) layers of silicon, depending
on the application. Silicon epitaxy provides flexibility for a device designer to tailor
or optimize the device performance by allowing for greater control of doping
concentration and profile in deposited layers. Si-Ge films are always epitaxial.
Chapter 5 contains details regarding the CVD equipment and processes involved
in our modeling effort.

One  important  modeling  objective  is  the  prediction  of  deposition  rates  and 
thickness  uniformity  given  operating  conditions  such  as  flow  rates  of  process  gases, 
wafer  temperature,  and  chamber  pressure.  Therefore,  the  process-equipment  model 
for  silicon  growth  in  the  Epsilon-1  describes  the  flow  of  process  gases  through  the 
chamber,  heat  transfer  in  the  gas  phase  and  within  and  among  the  various  solids  in 
the chamber including the wafer, the transport of chemical species within a multicomponent
gas, and chemical mechanisms for gas phase and surface reactions.
It takes the form of a set of coupled nonlinear PDEs and associated boundary
conditions together with chemical kinetics equations. Moreover, we note that due
to the asymmetrical design of the Epsilon-1 deposition chamber and heat lamp apparatus,
models incorporating three spatial dimensions are required to sufficiently
describe the various phenomena (approximations incorporating one or two spatial
dimensions via symmetries are useful in certain situations). Discretization (e.g.,
finite volumes, finite elements) at a suitable resolution results in a model with
thousands of states.

Due to the scale and scope of the overall process-equipment model, it is advantageous
to develop a separate model that focuses specifically on heat transfer
among  the  various  solid  surfaces  in  the  RTP  chamber  including  the  semiconductor 
wafer.  Such  models  are  often  used  in  model-based  control  strategies  for  achieving 
temperature  uniformity  across  the  wafer  surface.  However,  as  we  demonstrate  in 
Chapter  5,  temperature  uniformity  does  not  necessarily  ensure  deposition  thickness 
uniformity. 

Nevertheless, in Chapter 6, we develop such an RTP heat transfer model pertaining
specifically to the Epsilon-1. Physics-based models for RTP heat transfer give
a distributed parameter control system with a radiative (fourth-power) nonlinearity
in  the  governing  equations  and  boundary  conditions.  Low-order  approximations 
to  the  model  are  desired  in  order  to  facilitate  control  law  design,  model-based 
feedback  control  implementation,  and  computer  simulation.  We  derive  low-order 
models  for  the  RTP  heat  transfer  control  system  using  the  reduction  approaches 
described  in  this  thesis,  and  compare  their  relative  merits  and  drawbacks. 

1.2  Scope  and  Contributions 

We list here the main contributions of this thesis.

• We develop useful methods, tools, and algorithms to compute the energy
functions  and  coordinate  transformations  involved  in  the  Scherpen  theory 
and  procedure  for  nonlinear  balancing.  We  apply  our  approach  to  derive,  for 
the  first  time,  balanced  representations  of  nonlinear  state-space  models. 

•  We  offer  a  new  method  involving  stochastic  excitation  for  approximating  the 
controllability  energy  function  of  a  nonlinear  system. 

•  We  determine  conditions  under  which  an  exact  formula  can  be  written  for  the 
controllability  energy  function  of  a  nonlinear  Hamiltonian  system  perturbed 
by  dissipation  and  forcing.  We  apply  our  result  to  provide  an  expression  for 
the  controllability  function  of  a  4-state  nonlinear  mechanical  system. 

•  We  present  an  algorithm  for  a  numerical  implementation  of  the  Morse-Palais 
lemma,  i.e.,  computation  of  a  local  coordinate  transformation  under  which 
a  real-valued  function  with  a  non-degenerate  critical  point  is  quadratic  on  a 
neighborhood  of  the  critical  point. 

• We develop a collection of programs and utilities in a standard programming
language to facilitate the practical application of our methods and
algorithms. 

•  We  develop  a  high-fidelity  process-equipment  model  for  deposition  of  silicon 
thin films in a commercial rapid thermal CVD reactor. The model allows
us to simulate growth experiments under a broad range of process conditions
while taking account of the various physical and chemical phenomena
involved in CVD of silicon from a multi-component gas.

• We investigate the factors that influence deposition rate and uniformity in a
commercial  reactor  including  the  effects  of  temperature,  precursor  gas  flow 
rates,  and  chamber  pressure.  We  determine  relationships  between  the  various 
factors  and  growth  rate  that  can  be  used  to  predict  their  effects  in  a  particular 
situation. 

• We demonstrate through anecdotal evidence, simulation results, and experimental
data that achieving deposition thickness uniformity requires a certain
degree  of  temperature  non-uniformity  across  the  wafer  surface. 

•  We  apply  view  factor  methods  to  develop  a  radiative  heat  transfer  model 
for  the  heating  of  a  semiconductor  wafer  via  tungsten-halogen  lamps  in 
a commercial RTP reactor. The model incorporates a non-symmetric 3-dimensional
chamber and lamp array geometry, a feature not commonly
found  in  the  literature.  Furthermore,  the  model  is  partially  validated  through 
an  ad-hoc  experimental  procedure. 

• We formulate ad-hoc procedures for applying standard reduction methodologies
to physics-based models for RTP heat transfer. We apply the procedures
to  a  high-order  control  system  model  for  heat  transfer  in  a  commercial  RTP 
chamber  to  derive  low-order  model  approximations  that  faithfully  reproduce 
the  relevant  input-to-output  behavior  of  the  original  model. 

• We provide a guide to the use of standard and ad-hoc model reduction approaches
that does not sacrifice rigor while serving as a practical tool that
emphasizes computational issues and potential hazards. In the process we
illuminate important connections between prominent methods.

We list here some contributions of this thesis that either support the main body
of  work  or  are  necessary  for  completeness  of  the  exposition  but  are  not  of  primary 
interest  or  importance. 

• We present a new proof of a theorem by Scherpen that appeals to the connections
between the result and optimal control theory.

• We provide new and more general conditions on the output map of a nonlinear
system such that the observability energy function exists, i.e., is finite.

• We bring to light and illustrate through examples issues of non-uniqueness
regarding the Morse coordinate transformation and balancing transformation.

•  We  demonstrate  that  the  consumption  of  process  gases  in  the  Epsilon-1 
RTCVD  reactor  can  be  reduced  by  decreasing  the  purge  gas  flow. 

1.3  Thesis  Outline 

The material presented in this thesis is reasonably self-contained. It is organized
by  chapters  as  follows. 

Chapter  2  We  provide  the  mathematical  preliminaries  necessary  for  working  with 
the  main  topics  of  this  thesis. 

Chapter  3  We  introduce  two  prominent  model  reduction  approaches:  POD  and 
balanced  truncation.  We  describe  the  current  state-of-the-art,  including  the 
underlying  theory,  computational  issues,  advantages  and  shortcomings,  and 
selected applications. The material in this chapter motivates the research
in Chapter 4 and explains the methods and computational tools used in
Chapter 6.

Chapter  4  We  address  the  problem  of  computability  pertaining  to  the  Scherpen 
theory  and  procedure  for  balancing  of  nonlinear  systems.  We  offer  methods 
and  algorithms  toward  balancing  stable  affine  nonlinear  control  systems,  with 
some  emphasis  on  computation  of  the  controllability  energy  function  and 
the  Morse  coordinate  transformation  of  a  function  around  a  non-degenerate 
critical  point. 

Chapter 5 We develop high-fidelity physical-chemical models for predicting the
behavior and output of a commercial RTCVD reactor used for depositing
thin films of Si and Si-Ge on silicon wafers in a manufacturing environment.
We  present  the  results  of  simulations  and  growth  experiments  and  use  them 
to  study  the  factors  that  influence  deposition  rate  and  uniformity  in  the 
reactor. 

Chapter 6 We formulate a physical model describing heat transfer in a commercial
RTCVD reactor. We then derive low-order models for the resulting RTP
heat  transfer  control  system,  using  ad-hoc  versions  of  methods  described  in 
Chapter  3. 

Chapter 7 We present concluding remarks and comments on future research
opportunities.

Appendices  The  Appendices  contain  supporting  material  that  is  essential  for 
completeness but that would be disruptive within the main exposition.

Chapter 2


Preliminaries 


This  thesis  makes  use  of  tools,  and  draws  concepts  and  ideas,  from  several  different 
areas  of  science  and  mathematics.  Here  we  collect  the  basic  definitions  and  results 
so  that  they  may  be  used  without  any  detailed  explanation  later  in  the  thesis. 
Topics  are  covered  in  additional  depth  in  the  listed  references. 

2.1  State-Space  Control  Systems 

This thesis deals with model reduction of continuous-time state-space control systems.
We focus on methods and algorithms for reduction of finite-dimensional
models. The mathematical framework for these models is that of ordinary differential
equations (ODEs) for the state evolving on a smooth manifold and represented
in terms of a local coordinate system. The necessary machinery for working
with manifolds and local coordinates is set up in Appendix B. The reduction approaches
that we consider require some elements of system theory (continuous-time
finite-dimensional) which appear in Section 2.2.

Some of the modeling and reduction methods are directly applicable in the
infinite-dimensional setting. However, for purposes of this thesis, we generally
assume  that  an  infinite-dimensional  model  has  been  suitably  discretized  and  work 
with  the  finite-dimensional  approximation. 

In  the  finite-dimensional  case,  the  state-space  control  system  model  is  given 
by  a  pair  of  equations,  the  state  equation  and  the  output  equation,  respectively, 
describing the evolution of the state on a smooth manifold given a specified input,
and the relationship between the state and output. Here we present these
equations in their most general form, followed by some important specializations.
An important component of the state-space model reduction procedure is coordinate
transformation. For this reason, we describe what happens to the evolution
equations  under  diffeomorphic  change  of  coordinates. 

The  material  contained  in  this  section  is  standard.  Our  treatment  is  based  on 
texts  by  Khalil  [79],  Nijmeijer  and  van  der  Schaft  [121],  and  Isidori  [69].  For  proofs 
we  refer  to  the  literature. 

System  Equations 

The  state-space  is  assumed  to  be  an  n-dimensional  smooth  manifold  M.  The 
finite-dimensional  control  system  evolving  on  M  is  given  by  the  equations 


\[ \dot{x}(t) = f(t, x(t), u(t)) \tag{2.1} \]

\[ y(t) = h(t, x(t), u(t)) \tag{2.2} \]

where $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ denotes local coordinates for the state,
$u = (u_1, \dots, u_m) \in U \subset \mathbb{R}^m$ denotes the input (control), and $y = (y_1, \dots, y_p) \in \mathbb{R}^p$
denotes the output. The maps $f$ and $h$ are to be interpreted as their respective
local representatives. The map $f : \mathbb{R}^+ \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is called
the system map. The map $h : \mathbb{R}^+ \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^p$ is called the output map.
We often make assumptions about the regularity of $f$ and $h$, e.g., that they are of
class $C^k$ or possibly smooth. The ordinary differential equation (2.1) is called the
state equation and governs the time evolution of the state given a specified input
and initial state. Equation (2.2) is called the output equation. These equations are
written in vector notation, shorthand for

\[
\begin{aligned}
\dot{x}_1(t) &= f_1(t, x_1(t), \dots, x_n(t), u_1(t), \dots, u_m(t)) \\
&\;\;\vdots \\
\dot{x}_n(t) &= f_n(t, x_1(t), \dots, x_n(t), u_1(t), \dots, u_m(t))
\end{aligned}
\tag{2.3}
\]

and

\[
\begin{aligned}
y_1(t) &= h_1(t, x_1(t), \dots, x_n(t), u_1(t), \dots, u_m(t)) \\
&\;\;\vdots \\
y_p(t) &= h_p(t, x_1(t), \dots, x_n(t), u_1(t), \dots, u_m(t))
\end{aligned}
\tag{2.4}
\]


Each input signal $u$ belongs to the class $\mathcal{U}$ of admissible controls, which we take as
the set

\[ \mathcal{U} = \{ u : \mathbb{R}^+ \to U \subset \mathbb{R}^m \; : \; u \in C^\infty \} \tag{2.5} \]

or sometimes more generally

\[ \mathcal{U} = \{ u : \mathbb{R}^+ \to U \subset \mathbb{R}^m \; : \; u \text{ is piecewise continuous from the right} \} \tag{2.6} \]

Given a specified input $u$ and initial state $x_0$, the time evolution of the state is
given by the initial value problem

\[ \dot{x}(t) = \tilde{f}(t, x(t)), \qquad x(t_0) = x_0 \tag{2.7} \]

where

\[ \tilde{f}(t, x(t)) = f(t, x(t), u(t)) \]

In order for (2.7) to predict the state trajectory, it must have a unique solution.
This can be guaranteed under conditions given by the following result.

Theorem 2.1.1 (Local Existence and Uniqueness) Let $\tilde{f}(t, x)$ be piecewise
continuous in $t$ and satisfy the Lipschitz condition

\[ \| \tilde{f}(t, x) - \tilde{f}(t, y) \| \le L \, \| x - y \| \tag{2.8} \]

for all $x, y \in B = \{ x \in \mathbb{R}^n : \| x - x_0 \| \le r \}$ and for all $t \in [t_0, t_1]$. Then, there
exists some $\delta > 0$ such that the initial value problem (2.7) has a unique solution
over $[t_0, t_0 + \delta]$. □


Remark 2.1.2 The unique solution of (2.7) on $[t_0, t_0 + \delta]$, if it exists, satisfies

\[ x(t) = x_0 + \int_{t_0}^{t} \tilde{f}(s, x(s)) \, ds, \qquad t \in [t_0, t_0 + \delta] \tag{2.9} \]

where $\tilde{f}(s, x) = f(s, x, u(s))$ is the right-hand side of (2.7). It is referred to as
the state trajectory and sometimes denoted $x(t, x_0, t_0, u)$ to
explicitly indicate the initial state, initial time, and specified input. The corresponding
output $y(t)$ is referred to as the output trajectory. □
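
In practice the state trajectory is usually obtained by numerical integration of the initial value problem rather than from the integral form (2.9). The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta scheme with a fixed step; the function names (`rk4_trajectory`, `f_example`, `u_example`) and the scalar test system are hypothetical and chosen only because the exact solution is known.

```python
import math

def rk4_trajectory(f, u, x0, t0, t1, steps):
    """Integrate x'(t) = f(t, x(t), u(t)) from t0 to t1 with classical RK4.

    f(t, x, u) returns the list of state derivatives; u(t) returns the input.
    Returns the list of (t, x) samples, including the initial point.
    """
    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    samples = [(t, x)]
    for _ in range(steps):
        k1 = f(t, x, u(t))
        k2 = f(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k1)], u(t + h / 2))
        k3 = f(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k2)], u(t + h / 2))
        k4 = f(t + h, [xi + h * k for xi, k in zip(x, k3)], u(t + h))
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t = t + h
        samples.append((t, x))
    return samples

# Hypothetical scalar example: x' = -x + u with constant input u = 1, x(0) = 0.
# Its exact solution is x(t) = 1 - exp(-t).
def f_example(t, x, u):
    return [-x[0] + u]

def u_example(t):
    return 1.0

trajectory = rk4_trajectory(f_example, u_example, [0.0], 0.0, 2.0, 200)
t_final, x_final = trajectory[-1]
exact_final = 1.0 - math.exp(-t_final)
```

A fixed-step scheme is used here only to keep the sketch short; an adaptive integrator would normally be preferred for stiff or strongly nonlinear models.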


Remark  2.1.3  The  Lipschitz  property  is  weaker  than  continuous  differentiability. 
In  this  thesis,  we  usually  work  with  f  and  u  that  are  smooth  in  their  respective 
arguments.  Thus,  we  generally  assume  local  existence  and  uniqueness  of  solutions 
for  the  systems  that  we  consider,  unless  specified  otherwise.  □ 


A point $x$ for which $f(\cdot, x, 0) = 0$ is called an equilibrium. If a system has an
equilibrium, then, without loss of generality, we assume that it is at $x = 0$, i.e.,
$f(\cdot, 0, 0) = 0$, unless otherwise specified. We also assume that $h(\cdot, 0, 0) = 0$ so that
the output is zero whenever the state and the input are zero.

We focus our attention on state-space models for which the functions $f$ and $h$
do not depend explicitly on $t$, i.e.,

\[ \dot{x}(t) = f(x(t), u(t)) \tag{2.10} \]

\[ y(t) = h(x(t), u(t)) \tag{2.11} \]

With a specified input $u$, these models are referred to as autonomous or time-invariant.

A particular class of systems that we consider in this thesis is the class of
autonomous affine systems

\[ \dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} g_i(x(t)) \, u_i(t) \tag{2.12} \]

\[ y(t) = h(x(t)) \tag{2.13} \]

in which the input enters the state equation in an affine way, and there is no direct
feedthrough of the input to the output. The maps $f : \mathbb{R}^n \to \mathbb{R}^n$, $g_i : \mathbb{R}^n \to \mathbb{R}^n$,
$i \in \underline{m}$ (i.e., $i = 1, \dots, m$), and $h : \mathbb{R}^n \to \mathbb{R}^p$ are to be interpreted as local representatives. As before,
we assume, without loss of generality, the existence of an equilibrium at $x = 0$,
i.e., $f(0) = 0$, as well as $h(0) = 0$.

Remark 2.1.4 Sometimes we use the notation $g_{ij}$, which refers to the $i$-th component
of the $j$-th input function, i.e., $g_{ij} = (g_j)_i$. □
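
As a concrete instance of the affine form (2.12)-(2.13), consider a damped pendulum driven by a torque input with the angle as output. The model below, including the normalization of the physical constants and the damping coefficient $c$, is an assumption made here for illustration and is not taken from the later chapters.

```latex
% Hypothetical illustration: normalized damped pendulum
%   \ddot{\theta} + c\,\dot{\theta} + \sin\theta = u, \quad c > 0,
% with state x = (x_1, x_2) = (\theta, \dot{\theta}) and output y = \theta.
\[
f(x) = \begin{pmatrix} x_2 \\ -\sin x_1 - c\,x_2 \end{pmatrix}, \qquad
g_1(x) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad
h(x) = x_1 ,
\]
\[
\dot{x} = f(x) + g_1(x)\,u, \qquad y = h(x), \qquad f(0) = 0, \quad h(0) = 0 .
\]
```

Here $n = 2$ and $m = p = 1$, and in the notation of Remark 2.1.4 the input function has components $g_{11} = 0$ and $g_{21} = 1$.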

Coordinate  Transformations 

Model  reduction  involves  smooth  coordinate  transformations  of  the  state-space. 
Let $\{e_1, \dots, e_n\}$ denote the standard basis for $\mathbb{R}^n$, i.e., $e_i$ is a vector with a 1 in
the $i$-th position and a 0 in every other position. We assume that the functions
$f$, $g$, and $h$ have been formulated with respect to the standard coordinate system,
i.e., the basis for the original local coordinate system is the standard basis, so that
in terms of local coordinates the state vector can be written

\[ x = \sum_{i=1}^{n} x_i \, e_i \]

We now consider what happens to Equations (2.12) and (2.13) under coordinate
transformation. Let $U$ and $V$ be subsets of $\mathbb{R}^n$ containing 0. Let $S : U \to V$ be
a diffeomorphism such that $S(0) = 0$ (to preserve the equilibrium at 0). The fact
that $S$ is a diffeomorphism allows for reversing the transformation and recovering
the original state, and guarantees that the system in the new coordinates is still
smooth. We call $S$ a smooth local coordinate transformation about the origin.
Under the smooth local coordinate transformation

\[ z \mapsto x = S(z) \tag{2.14} \]

the control system (2.12)-(2.13) transforms to

\[ \dot{z}(t) = \bar{f}(z(t)) + \sum_{i=1}^{m} \bar{g}_i(z(t)) \, u_i(t) \tag{2.15} \]

\[ y(t) = \bar{h}(z(t)) \tag{2.16} \]

where

\[
\begin{aligned}
\bar{f}(z) &= [DS(z)]^{-1} f(S(z)) \\
\bar{g}_i(z) &= [DS(z)]^{-1} g_i(S(z)), \qquad i \in \underline{m} \\
\bar{h}(z) &= h(S(z))
\end{aligned}
\]
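
The transformation rule for the vector fields can be checked numerically: integrating the system in the $z$ coordinates and mapping the result through $S$ should reproduce the trajectory of the original system. The sketch below assumes a hypothetical damped-pendulum drift, a single input vector field, and a hypothetical quadratic transformation $S(z) = (z_1, z_2 + z_1^2)$ with $S(0) = 0$; none of these choices come from the thesis.

```python
import math

def f(x):            # drift of a damped pendulum (hypothetical example)
    return [x[1], -math.sin(x[0]) - x[1]]

def g(x):            # single input vector field
    return [0.0, 1.0]

def S(z):            # x = S(z), a global diffeomorphism of R^2 with S(0) = 0
    return [z[0], z[1] + z[0] ** 2]

def DS_inv(z):       # inverse of the Jacobian DS(z) = [[1, 0], [2 z1, 1]]
    return [[1.0, 0.0], [-2.0 * z[0], 1.0]]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def f_bar(z):        # transformed drift [DS(z)]^{-1} f(S(z))
    return matvec(DS_inv(z), f(S(z)))

def g_bar(z):        # transformed input vector field [DS(z)]^{-1} g(S(z))
    return matvec(DS_inv(z), g(S(z)))

def rk4(drift, inp, u, x0, T, steps):
    """Integrate x' = drift(x) + inp(x) * u(t) on [0, T] with classical RK4."""
    def rhs(t, x):
        return [a + b * u(t) for a, b in zip(drift(x), inp(x))]
    h, t, x = T / steps, 0.0, list(x0)
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k1)])
        k3 = rhs(t + h / 2, [xi + h / 2 * k for xi, k in zip(x, k2)])
        k4 = rhs(t + h, [xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

u = lambda t: 0.5 * math.cos(t)          # arbitrary smooth test input
z0 = [0.3, -0.2]
z_T = rk4(f_bar, g_bar, u, z0, 5.0, 5000)     # trajectory in z coordinates
x_T = rk4(f, g, u, S(z0), 5.0, 5000)          # trajectory in x coordinates
mismatch = max(abs(a - b) for a, b in zip(S(z_T), x_T))
```

The quantity `mismatch` compares $S(z(T))$ with $x(T)$ and should be at the level of the integration error. With the output map $h(x) = x_1$, the transformed output map is simply $h(S(z)) = z_1$.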

Linear  Time-Invariant  Systems 

It  will  be  useful  on  occasion  to  consider  the  special  case  of  a  linear  time-invariant 
(LTI) system. The LTI specialization of (2.12)-(2.13) takes the form

\[ \dot{x} = A x + B u \tag{2.17} \]

\[ y = C x \tag{2.18} \]

where $A$ is $n \times n$, $B$ is $n \times m$, and $C$ is $p \times n$.

In the LTI case, the state-space manifold $M$ is equal to $\mathbb{R}^n$. The unique global
solution of (2.17) with initial state $x(0) = x_0$ is given by the variation of constants
formula

\[ x(t) = \exp(A t) \, x_0 + \int_0^t \exp(A (t - s)) \, B \, u(s) \, ds \tag{2.19} \]

A coordinate transformation is global and linear, represented by an invertible transformation
matrix. Let $S$ be the transformation matrix. Under the linear change
of coordinates

\[ z \mapsto x = S z \tag{2.20} \]

the LTI control system (2.17)-(2.18) transforms to

\[ \dot{z} = \bar{A} z + \bar{B} u \tag{2.21} \]

\[ y = \bar{C} z \tag{2.22} \]

where

\[ \bar{A} = S^{-1} A S, \qquad \bar{B} = S^{-1} B, \qquad \bar{C} = C S \tag{2.23} \]
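
A linear change of coordinates alters the matrices but not the input-to-output behavior. One way to see this is that the Markov parameters $C A^k B$ are unchanged by the transformation. The sketch below checks this with exact rational arithmetic for a hypothetical 2-state system; the matrices and helper names (`matmul`, `inv2`, `transform`, `markov`) are illustrative assumptions.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Exact inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("transformation matrix must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def transform(A, B, C, S):
    """Return (S^-1 A S, S^-1 B, C S) for the change of coordinates x = S z."""
    S_inv = inv2(S)
    return matmul(S_inv, matmul(A, S)), matmul(S_inv, B), matmul(C, S)

def markov(A, B, C, count):
    """First `count` Markov parameters C A^k B of a single-input single-output system."""
    out, AkB = [], B
    for _ in range(count):
        out.append(matmul(C, AkB)[0][0])
        AkB = matmul(A, AkB)
    return out

# Hypothetical example system and transformation matrix.
A = [[0, 1], [-2, -3]]
B = [[0], [1]]
C = [[1, 0]]
S = [[1, 1], [0, 1]]

A_bar, B_bar, C_bar = transform(A, B, C, S)
original_markov = markov(A, B, C, 4)
transformed_markov = markov(A_bar, B_bar, C_bar, 4)
```

For this example the transformed state matrix is `[[2, 6], [-2, -5]]`, which has the same trace and determinant as `A`, and both systems share the same Markov parameters.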

2.2  Some  Elements  of  System  Theory 

In this thesis certain key elements of systems theory appear frequently. The notions
of stability, controllability, and observability of a control system are essential.
They are used in their most general form as well as specializations for particular
situations. For example, in Section 4.6 we will need to show that certain example
systems are locally accessible, locally observable, and asymptotically stable.

The material contained in this section is drawn mainly from texts by Khalil [79],
Nijmeijer  and  van  der  Schaft  [121],  Vidyasagar  [161],  and  class  notes  in  Geometric 
Control  presented  by  Dayawansa  [37]  at  the  University  of  Maryland.  For  proofs 
we  refer  to  the  literature. 

Stability 

Here we present some standard definitions and results on the local stability of a
time-invariant system without inputs, i.e.,

\[ \dot{x}(t) = f(x(t)) \tag{2.24} \]

where $f : D \subset \mathbb{R}^n \to \mathbb{R}^n$ is locally Lipschitz so that there exists a unique solution
on an interval $[0, d]$.

We are concerned with the stability of equilibrium points. Without loss of
generality we can assume that the system has an equilibrium at 0, i.e., $f(0) = 0$.

Definition 2.2.1 (Stability of Equilibrium) The equilibrium point $x = 0$ of
system (2.24) is said to be stable if for any neighborhood $U$ of 0 there exists a
neighborhood $V$ of 0 such that if $x(0) \in V$ then the solution $x(t, 0, x(0))$ belongs to
$U$ for all $t \ge 0$. □

Remark 2.2.2 The equilibrium point $x = 0$ of (2.24) is said to be unstable if it
is not stable. □

Definition 2.2.3 (Asymptotic Stability of Equilibrium) The equilibrium
point $x = 0$ of (2.24) is said to be asymptotically stable if it is stable and there
exists a neighborhood $W$ of 0 such that if $x(0) \in W$ then

\[ \lim_{t \to \infty} x(t, 0, x(0)) = 0 \tag{2.25} \]

□

Definition 2.2.4 (Region of Attraction) Let $x = 0$ be asymptotically stable
for the system (2.24). The region of attraction is defined as the set

\[ \left\{ x_0 \in \mathbb{R}^n : \lim_{t \to \infty} x(t, 0, x_0) = 0 \right\} \tag{2.26} \]

i.e., the set of points from which the trajectory approaches the origin as $t \to \infty$.

□

Definition 2.2.5 (Exponential Stability of Equilibrium) The equilibrium
point $x = 0$ of (2.24) is said to be exponentially stable if there exist constants
$k > 0$ and $\gamma > 0$, such that

\[ \| x(t) \| \le k \, \| x(0) \| \exp(-\gamma t), \qquad t \ge 0 \tag{2.27} \]

□
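
The difference between asymptotic and exponential stability can be seen in two scalar systems with closed-form solutions. Both systems are hypothetical illustrations and do not appear elsewhere in the thesis.

```latex
% Example 1: \dot{x} = -x.
\[
x(t) = e^{-t}\, x(0)
\quad\Longrightarrow\quad
\| x(t) \| \le k \, \| x(0) \| \, e^{-\gamma t} \ \text{ with } k = 1,\ \gamma = 1 .
\]
% Example 2: \dot{x} = -x^3.
\[
x(t) = \frac{x(0)}{\sqrt{1 + 2\, x(0)^2\, t}} \;\longrightarrow\; 0
\quad \text{as } t \to \infty \text{ for every } x(0) \in \mathbb{R} .
\]
% For x(0) \neq 0 the decay is only of order t^{-1/2}, so no bound of the form (2.27) can hold:
\[
e^{\gamma t}\, | x(t) | = \frac{e^{\gamma t}\, | x(0) |}{\sqrt{1 + 2\, x(0)^2\, t}} \;\longrightarrow\; \infty
\quad \text{as } t \to \infty \text{ for every } \gamma > 0 .
\]
```

The origin of Example 1 is exponentially stable. The origin of Example 2 is stable, since $|x(t)| \le |x(0)|$, and asymptotically stable with region of attraction $\mathbb{R}$, but it is not exponentially stable.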


Remark  2.2.6  Depending  on  the  situation,  stability  (asymptotic  stability)  of 
(2.24) can be verified via the direct method of Lyapunov, indirect method of Lyapunov,
or the invariance principle of LaSalle. Instability can be verified via a
theorem  of  Cetaev.  Since  we  do  not  explicitly  use  these  results  in  this  thesis,  we 
refer  the  reader  to  the  literature.  □ 

Controllability 

Here  we  present  some  standard  definitions  and  results  regarding  the  controllability 
and  reachability  of  a  nonlinear  system,  some  of  which  pertain  specifically  to  the 
affine system (2.12). In addition, we introduce the notion of asymptotic reachability,
which appears in Chapter 4. See [121] for background on the notion of a Lie
algebra.

Definition 2.2.7 (Controllable System) The system (2.12) is said to be controllable
if for any two points $x_1, x_2$ in $M$ there exists a finite time $T$ and an
admissible input $u : [0, T] \to U$ such that $x(T, 0, x_1, u) = x_2$. □

Definition 2.2.8 (Reachable Set) The reachable set $R^V(x_0, T)$ from $x_0$ at time
$T > 0$ following trajectories which remain for $t \le T$ in the neighborhood $V$ of $x_0$
is defined as the set of all points $x \in M$ for which there exists $u : [0, T] \to U$ such
that $x(t, 0, x_0, u) \in V$, $t \in [0, T]$, and $x(T, 0, x_0, u) = x$. We also denote

\[ R_T^V(x_0) = \bigcup_{\tau \le T} R^V(x_0, \tau) \tag{2.28} \]

□

Definition 2.2.9 (Locally Accessible System) The system (2.12) is said to be

(i) locally accessible from $x_0$ if $R_T^V(x_0)$ contains a non-empty open set of $M$ for
all neighborhoods $V$ of $x_0$ and all $T > 0$;

(ii) locally accessible if the condition in (i) holds for every $x_0 \in M$;

(iii) locally strongly accessible from $x_0$ if, for any neighborhood $V$ of $x_0$, the set
$R^V(x_0, T)$ contains a non-empty open set for any $T > 0$ sufficiently small;

(iv) locally strongly accessible if the condition in (iii) holds for every $x_0 \in M$.

□ 


Definition 2.2.10 (Accessibility Algebra, Accessibility Distribution)

For the system (2.12)-(2.13), the

(i) accessibility algebra $\mathcal{C}$ is the smallest subalgebra of the Lie algebra of vector
fields on $M$ that contains $\{f, g_1, \dots, g_m\}$;

(ii) strong accessibility algebra $\mathcal{C}_0$ is the smallest subalgebra that contains
$g_1, \dots, g_m$ and satisfies $[f, X] \in \mathcal{C}_0$ for all $X \in \mathcal{C}_0$;

(iii) accessibility distribution $C(x)$ at $x \in M$ is the distribution generated by the
accessibility algebra, i.e.,

\[ C(x) = \operatorname{span} \{ X(x) : X \text{ is a vector field in } \mathcal{C} \} \]

(iv) strong accessibility distribution $C_0(x)$ at $x \in M$ is the distribution generated
by the strong accessibility algebra, i.e.,

\[ C_0(x) = \operatorname{span} \{ X(x) : X \text{ is a vector field in } \mathcal{C}_0 \} \]


□ 


Theorem 2.2.11 For the system (2.12)-(2.13), if

(i) $\dim(C(x_0)) = n$ then the system is locally accessible from $x_0$;

(ii) $\dim(C(x)) = n$ for all $x \in M$ then the system is locally accessible;

(iii) $\dim(C_0(x_0)) = n$ then the system is locally strongly accessible from $x_0$;

(iv) $\dim(C_0(x)) = n$ for all $x \in M$ then the system is locally strongly accessible.

□ 
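
The rank conditions of Theorem 2.2.11 can be worked by hand for a small example. The computation below uses a hypothetical normalized damped pendulum with torque input (an assumed model, with damping coefficient $c$) and the bracket convention $[f, g] = \frac{\partial g}{\partial x} f - \frac{\partial f}{\partial x} g$, which is an assumption about notation.

```latex
% Hypothetical system: f(x) = (x_2,\; -\sin x_1 - c\,x_2)^T, \quad g(x) = (0,\; 1)^T, \quad n = 2.
\[
\frac{\partial f}{\partial x} = \begin{pmatrix} 0 & 1 \\ -\cos x_1 & -c \end{pmatrix},
\qquad
\frac{\partial g}{\partial x} = 0 ,
\]
\[
[f, g](x) = \frac{\partial g}{\partial x} f(x) - \frac{\partial f}{\partial x} g(x)
          = - \begin{pmatrix} 1 \\ -c \end{pmatrix}
          = \begin{pmatrix} -1 \\ c \end{pmatrix} .
\]
% Both g and [f, g] belong to the strong accessibility algebra, and
\[
\det \begin{pmatrix} 0 & -1 \\ 1 & c \end{pmatrix} = 1 \neq 0
\quad\Longrightarrow\quad
\dim(C_0(x)) = 2 = n \ \text{ for all } x .
\]
```

By parts (iii) and (iv) of the theorem, this example system is locally strongly accessible; since the accessibility algebra contains the strong accessibility algebra, it is also locally accessible.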


Definition 2.2.12 (Asymptotically Reachable System) The system (2.12)
is said to be asymptotically reachable from $x_0$ on a neighborhood $W$ of $x_0$ if for
each $x \in W$ there exists a $u$ such that $x(t, 0, x_0, u) \in W$ for all $t \ge 0$ and

\[ \lim_{t \to \infty} x(t, 0, x_0, u) = x \]

□

Observability


Here we present some standard definitions and results regarding the observability
of a nonlinear system, some of which pertain specifically to the affine system
(2.12)-(2.13).

Definition 2.2.13 (Indistinguishable States) Two states $x_1, x_2 \in M$ are said
to be $V$-indistinguishable (denoted $x_1 I^V x_2$) for system (2.12)-(2.13) if for each
admissible input $u : [0, T] \to U$, $T > 0$, with the property that $x(t, 0, x_1, u)$ and
$x(t, 0, x_2, u)$ both remain in $V$ for $t \le T$, the output function $t \mapsto y(t, 0, x_1, u)$ for
$t \ge 0$ and initial state $x(0) = x_1$ and the output function $t \mapsto y(t, 0, x_2, u)$ for $t \ge 0$
and initial state $x(0) = x_2$ are identical on their common domain of definition. □

Definition 2.2.14 (Observable System) The system (2.12)-(2.13) is said to be
observable if $x_1 I^M x_2$ implies that $x_1 = x_2$. □

Definition 2.2.15 (Locally Observable System) The system (2.12)-(2.13) is
said to be

(i) locally observable at $x_0$ if there exists a neighborhood $W$ of $x_0$ such that for
every neighborhood $V \subset W$ of $x_0$ the relation $x_0 I^V x_1$ implies that $x_1 = x_0$,
i.e., indistinguishability implies equality;

(ii) locally observable if it is locally observable at each $x_0$.

□

Definition 2.2.16 (Zero-state Observable System) The system
(2.12)-(2.13) is said to be

(i) locally zero-state observable if there exists a neighborhood $W$ of 0 such that
for each $x \in W$, if $y(t, 0, x, 0) = 0$ for $t \ge 0$ then $x(t, 0, x, 0) = 0$ for $t \ge 0$;

(ii) zero-state observable if the above holds for all $x \in M$.

□


Definition 2.2.17 (Observation Space, Observability Codistribution)

For the system (2.12)-(2.13), the

(i) observation space $\mathcal{O}$ is the linear space of functions on $M$ containing
$h_1, \dots, h_p$ and all repeated Lie derivatives $L_{X_1} L_{X_2} \cdots L_{X_k} h_j$ for $j \in \underline{p}$, $k = 1, 2, \dots$,
with $X_i$, $i = 1, \dots, k$, in the set $\{f, g_1, \dots, g_m\}$;

(ii) observability codistribution $d\mathcal{O}$ at $x \in M$ is defined by

\[ d\mathcal{O}(x) = \operatorname{span} \{ dH(x) : H \in \mathcal{O} \} \]


□ 


Theorem 2.2.18 For the system (2.12)-(2.13), if

(i) $\dim(d\mathcal{O}(x_0)) = n$ then the system is locally observable at $x_0$;

(ii) $\dim(d\mathcal{O}(x)) = n$ for all $x \in M$ then the system is locally observable.

□ 
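
The observability rank condition of Theorem 2.2.18 can likewise be checked by hand. The example below assumes a hypothetical normalized damped pendulum with the angle as the measured output; the model and the damping coefficient $c$ are illustrative assumptions.

```latex
% Hypothetical system: f(x) = (x_2,\; -\sin x_1 - c\,x_2)^T, \quad h(x) = x_1, \quad n = 2.
\[
L_f h(x) = \frac{\partial h}{\partial x} f(x) = \begin{pmatrix} 1 & 0 \end{pmatrix}
\begin{pmatrix} x_2 \\ -\sin x_1 - c\,x_2 \end{pmatrix} = x_2 ,
\]
\[
dh(x) = \begin{pmatrix} 1 & 0 \end{pmatrix}, \qquad
d(L_f h)(x) = \begin{pmatrix} 0 & 1 \end{pmatrix} .
\]
% Both h and L_f h belong to the observation space, so
\[
\dim(d\mathcal{O}(x)) = 2 = n \ \text{ for all } x .
\]
```

By part (ii) of the theorem, this example system is locally observable: the angle and its first Lie derivative along $f$, the angular velocity, already determine the state.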


2.3  Principal  Component  Analysis 

Principal component analysis (PCA) refers to a particular type of orthogonal decomposition
for a matrix-valued signal $F(t)$ as described below. The signal is
represented by a piecewise continuous map $F : \mathbb{R}^+ \to \mathbb{R}^{n \times m}$. We use (and adapt
somewhat) the terminology presented by Moore [109].

The Gramian matrix is an object of primary interest.

Definition 2.3.1 (Gramian Matrix) Given a piecewise continuous map
$F : [t_1, t_2] \to \mathbb{R}^{n \times m}$, the Gramian matrix $W^2[t_1, t_2]$ for $F$ is defined by

\[ W^2[t_1, t_2] = \int_{t_1}^{t_2} F(t) \, F^T(t) \, dt \tag{2.29} \]

□

We usually deal with signals on the interval $[0, \infty)$ (infinite time horizon) and will
use the term Gramian matrix and notation $W^2$ to mean $W^2[0, \infty)$ unless otherwise
noted. Gramians with finite time horizons are useful in situations where the signals
of interest grow unbounded, e.g., unstable systems.

The Gramian matrix $W^2$ is a non-negative definite matrix. Therefore, it has
$n$ non-negative real eigenvalues $\sigma_1^2 \ge \dots \ge \sigma_n^2 \ge 0$ and $n$ corresponding mutually
orthogonal unit eigenvectors $v_1, \dots, v_n$ (we ignore the case where $W^2$ has repeated
eigenvalues and Jordan blocks of order 2 or higher).

The standard Fourier analysis tells us that any signal $F : [t_1, t_2] \to \mathbb{R}^{n \times m}$ can
be represented as the linear combination of dyads

\[ F(t) = \sum_{i=1}^{n} v_i \, a_i^T(t) \tag{2.30} \]

where

\[ a_i^T(t) = v_i^T F(t), \qquad i \in \underline{n} \tag{2.31} \]

correspond to the Fourier coefficients.

Remark 2.3.2 PCA refers to an orthogonal decomposition (2.30) of signal $F(t)$
in which the basis vectors $v_i$, $i \in \underline{n}$, are the unit eigenvectors of $W^2$. □

Remark  2.3.3  Regarding  (2.30),  we  use  the  following  standard  terminology,  re¬ 
ferring  to  the  i-th,  respectively,  principal  component  af1  (t),  component  vector  v^, 
component  magnitude  at,  and  component  function  vector  afit).  □ 
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The  PCA  enjoys  some  useful  properties. 


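Proposition 2.3.4 (Moore [109]) For $F : \mathbb{R}^+ \to \mathbb{R}^{n \times m}$ with PCA given by (2.30) the following relationships hold:

$$ \int_{t_1}^{t_2} a_i^T(t)\, a_j(t)\, dt = 0, \quad i \ne j \tag{2.32} $$

$$ \int_{t_1}^{t_2} \| a_i(t) \|^2\, dt = \sigma_i^2 \tag{2.33} $$

$$ \int_{t_1}^{t_2} \| F(t) \|_F^2\, dt = \sum_{i=1}^{n} \sigma_i^2 \tag{2.34} $$

where $\| \cdot \|_F$ denotes the Frobenius norm.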

□ 

The efficiency of the PCA as an orthogonal decomposition is due to the following result.

Proposition 2.3.5 (Moore [109]) Let $S_F$ denote the space

$$ S_F = \{ v : v \in \operatorname{Im}(F(t)),\ t \in [t_1, t_2] \} $$

Let $k$ be a fixed integer, $1 \le k \le n$. Over the class of piecewise continuous $F_A(t)$ satisfying $\dim(S_{F_A}) = k$, the residuals

$$ J_F = \int_{t_1}^{t_2} \| F(t) - F_A(t) \|_F^2\, dt \tag{2.35} $$

$$ J_S = \max_{\| v \| = 1} \int_{t_1}^{t_2} \| v^T ( F(t) - F_A(t) ) \|^2\, dt \tag{2.36} $$

are minimized by

$$ F_A(t) = F_k(t) = \sum_{i=1}^{k} v_i\, a_i^T(t) \tag{2.37} $$

with error residuals

$$ J_F = \sum_{i=k+1}^{n} \sigma_i^2 \tag{2.38} $$

$$ J_S = \sigma_{k+1}^2 \tag{2.39} $$

□

Remark 2.3.6 The set $S_F$ is spanned by those component vectors of $W^2$ corresponding to non-zero component magnitudes. □

Remark 2.3.7 Proposition 2.3.5 says that the most efficient $k$-th order approximation to $F$ is given by the PCA. □
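The properties (2.32) to (2.34) and the residual formula (2.38) can be checked numerically on a small example. The following is a minimal sketch, not part of the original development: the signal `F`, the interval $[0, 5]$, the midpoint quadrature, and all helper names are assumptions chosen for illustration, and the closed-form eigendecomposition only covers the case $n = 2$.

```python
import math

def F(t):
    # Hypothetical 2 x 1 signal (n = 2, m = 1), stored as a list of rows.
    return [[math.exp(-t)], [math.exp(-2.0 * t)]]

def integrate(g, t1, t2, steps=2000):
    # Midpoint rule for a scalar integrand g on [t1, t2].
    h = (t2 - t1) / steps
    return h * sum(g(t1 + (k + 0.5) * h) for k in range(steps))

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gramian(t1, t2):
    # W^2[t1, t2] = integral of F(t) F(t)^T dt, as in (2.29), for n = 2.
    def entry(i, j):
        return integrate(lambda t: dot(F(t)[i], F(t)[j]), t1, t2)
    return [[entry(i, j) for j in range(2)] for i in range(2)]

def sym_eig_2x2(W):
    # Eigenvalues (descending) and unit eigenvectors of a symmetric 2 x 2 matrix.
    a, b, d = W[0][0], W[0][1], W[1][1]
    mean, r = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    v1 = [math.cos(theta), math.sin(theta)]
    v2 = [-math.sin(theta), math.cos(theta)]
    return [mean + r, mean - r], [v1, v2]

def coeff(v, t):
    # a_i(t) = F(t)^T v_i, the coefficient vector in (2.31).
    Ft = F(t)
    return [sum(v[r] * Ft[r][c] for r in range(2)) for c in range(len(Ft[0]))]

T1, T2 = 0.0, 5.0
W = gramian(T1, T2)
sigma_sq, (v1, v2) = sym_eig_2x2(W)  # component magnitudes squared, component vectors

# Left-hand sides of (2.32), (2.33), and (2.34).
cross = integrate(lambda t: dot(coeff(v1, t), coeff(v2, t)), T1, T2)
energy = [integrate(lambda t, v=v: dot(coeff(v, t), coeff(v, t)), T1, T2)
          for v in (v1, v2)]
total = integrate(lambda t: sum(x * x for row in F(t) for x in row), T1, T2)

def rank1_residual(t):
    # Squared Frobenius norm of F(t) - v_1 a_1^T(t), the integrand of (2.35) for k = 1.
    a1, Ft = coeff(v1, t), F(t)
    return sum((Ft[r][c] - v1[r] * a1[c]) ** 2
               for r in range(2) for c in range(len(a1)))

J_F = integrate(rank1_residual, T1, T2)  # should match sigma_2^2, as in (2.38)
```

Because every integral uses the same quadrature nodes, the identities hold up to floating-point round-off rather than only up to discretization error.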

2.4  Hilbert  Spaces 

The  notion  of  a  Hilbert  space  appears  prominently  in  this  thesis,  particularly  in 
the  context  of  second-order  stochastic  processes  and  stochastically  excited  systems. 
We  work  with  several  examples  of  Hilbert  spaces  and  frequently  use  the  concepts 
of  orthogonality,  basis,  and  separability.  The  material  contained  in  this  section  is 
standard.  It  is  drawn  mainly  from  texts  by  Akhiezer  and  Glazman  [3]  and  Gohberg 
and  Goldberg  [56].  We  refer  to  the  literature  for  the  proofs. 

Definition 2.4.1 (Hilbert Space) A Hilbert space $\mathcal{H}$ is a vector space over $\mathbb{R}$ or $\mathbb{C}$ together with an inner product $\langle \cdot, \cdot \rangle$ and which is complete as a metric space. □

Remark 2.4.2 The norm is defined as $\| \phi \| = \langle \phi, \phi \rangle^{1/2}$ for $\phi \in \mathcal{H}$ and the metric is defined as $d(\phi, \psi) = \| \phi - \psi \|$ for $\phi, \psi \in \mathcal{H}$. The members of a Hilbert space are called elements or vectors. In this thesis, we consider only Hilbert spaces over $\mathbb{R}$. □

The  concepts  of  orthogonality  and  orthonormal  sets  will  be  crucial. 

Definition 2.4.3 (Orthogonal Vectors) Two distinct vectors $\phi$ and $\psi$ in a Hilbert space $\mathcal{H}$ are said to be orthogonal if

$$ \langle \phi, \psi \rangle = 0 \tag{2.40} $$

□

Definition 2.4.4 (Orthonormal Set) A countable collection of vectors $\Phi = \{\phi_1, \phi_2, \ldots\}$ in a Hilbert space $\mathcal{H}$ is said to be an orthonormal set if any two distinct vectors $\phi_i, \phi_j \in \Phi$, $i \ne j$, are orthogonal and $\| \phi_i \| = 1$ for all $i \ge 1$. □

Definition 2.4.5 (Complete Orthonormal Set) An orthonormal set $\Phi = \{\phi_1, \phi_2, \ldots\}$ in a Hilbert space $\mathcal{H}$ is said to be complete in $\mathcal{H}$ if there exists no vector in $\mathcal{H}$, except the zero vector, that is orthogonal to every vector in $\Phi$. □

When a Hilbert space contains an orthonormal set, every element of the Hilbert space gives rise to a convergent series expansion in terms of that set.

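Proposition 2.4.6 (Series Expansion Representation) If $\Phi = \{\phi_1, \phi_2, \ldots\}$ is an orthonormal set in $\mathcal{H}$ then for each $y \in \mathcal{H}$, the series

$$ \sum_{k=1}^{\infty} \langle y, \phi_k \rangle\, \phi_k \tag{2.41} $$

converges. Conversely, if

$$ y = \sum_{k} \alpha_k \phi_k \tag{2.42} $$

then $\alpha_k = \langle y, \phi_k \rangle$. □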

We  wish  to  establish  conditions  under  which  every  vector  in  the  Hilbert  space 
is  guaranteed  to  have  the  stated  expansion. 

Definition 2.4.7 (Orthonormal Basis) A countable orthonormal set $\Phi = \{\phi_1, \phi_2, \ldots\}$ is said to be an orthonormal basis for $\mathcal{H}$ if for each $y \in \mathcal{H}$ and for some $\alpha_1, \alpha_2, \ldots \in \mathbb{R}$

$$ y = \sum_{i} \alpha_i \phi_i \tag{2.43} $$

□

Remark 2.4.8 By the previous result we know that $\alpha_i = \langle y, \phi_i \rangle$. Each $\langle y, \phi_i \rangle$ is called a Fourier coefficient of $y \in \mathcal{H}$. □

Proposition 2.4.9 The orthonormal set $\Phi = \{\phi_1, \phi_2, \ldots\}$ is an orthonormal basis for the Hilbert space $\mathcal{H}$ if and only if $\Phi$ is complete in $\mathcal{H}$. □

Remark  2.4.10  Thus,  if  there  exists  a  complete  orthonormal  set  in  the  Hilbert 
space,  then  every  element  of  the  Hilbert  space  can  be  expanded  in  terms  of  the  basis 
vectors  and  Fourier  coefficients.  □ 
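The role of completeness can be seen in a finite-dimensional sketch. The example below is illustrative only: it works in $\mathbb{R}^3$ with the Euclidean inner product, and the basis, the vector `y`, and the helper names are assumptions. With the full orthonormal set the Fourier coefficients reproduce `y`; dropping one basis vector leaves the series convergent but no longer equal to `y`.

```python
import math

def inner(x, y):
    # Euclidean inner product on R^3.
    return sum(a * b for a, b in zip(x, y))

s = 1.0 / math.sqrt(2.0)
# A complete orthonormal set in R^3 (hypothetical choice).
basis = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -1.0, 2.0]

# Fourier coefficients <y, phi_i>.
coeffs = [inner(y, phi) for phi in basis]

# Expansion over the complete set recovers y.
recon = [sum(c * phi[k] for c, phi in zip(coeffs, basis)) for k in range(3)]

# Expansion over an incomplete orthonormal set (first two vectors only)
# gives the orthogonal projection of y, which misses the third coordinate.
partial = [sum(c * phi[k] for c, phi in zip(coeffs[:2], basis[:2]))
           for k in range(3)]
```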

It  is  logical  to  now  ask  under  what  conditions  a  Hilbert  space  will  contain  such 
a  complete  orthonormal  set. 

Definition 2.4.11 (Separable Hilbert Space) A Hilbert space $\mathcal{H}$ is separable if $\mathcal{H}$ contains a countable set which is dense in $\mathcal{H}$. □

Proposition  2.4.12  A  Hilbert  space  contains  an  orthonormal  basis  if  and  only  if 
it  is  separable.  □ 

Remark 2.4.13 In summary, we find that those Hilbert spaces that are separable contain a countable, complete, orthonormal set of vectors, i.e., an orthonormal basis for the Hilbert space, in which every vector in the Hilbert space can be expanded.
□ 


Finally,  we  note  the  following  result  which  states  that  if  an  orthonormal  basis 
exists,  it  is  not  unique. 

Proposition 2.4.14 (Non-uniqueness of Orthonormal Basis) Given a complete orthonormal set of vectors $\{\phi_i,\ i = 1, 2, \ldots\}$, the set $\{\psi_i,\ i = 1, 2, \ldots\}$ where

$$ \psi_i = \sum_{j} \alpha_{ij}\, \phi_j \tag{2.44} $$

for coefficients satisfying

$$ \sum_{k} \alpha_{ik}\, \alpha_{jk} = \delta_{ij} \tag{2.45} $$

is  also  a  complete  orthonormal  set  of  vectors.  □ 
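As a small illustration of (2.44) and (2.45), one can take a finite-dimensional case in which the coefficients $\alpha_{ij}$ form a rotation matrix. This sketch assumes $\mathbb{R}^3$, the standard basis as the starting set, and a hypothetical rotation angle; it only checks the orthonormality conditions numerically.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

# Starting orthonormal set: the standard basis of R^3.
phi = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Coefficients alpha_ij: a rotation about the third axis (hypothetical angle).
c, s = math.cos(0.7), math.sin(0.7)
alpha = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# psi_i = sum_j alpha_ij phi_j, as in (2.44).
psi = [[sum(alpha[i][j] * phi[j][k] for j in range(3)) for k in range(3)]
       for i in range(3)]

# Condition (2.45): sum_k alpha_ik alpha_jk = delta_ij.
gram_alpha = [[sum(alpha[i][k] * alpha[j][k] for k in range(3)) for j in range(3)]
              for i in range(3)]

# Resulting Gram matrix of the new set; it should be the identity.
gram_psi = [[inner(psi[i], psi[j]) for j in range(3)] for i in range(3)]
```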

Now  we  present  some  important  examples  of  Hilbert  spaces  that  we  will  use  in 
this  thesis. 

Example 2.4.15 ($\mathbb{R}^n$) The space of $n$-tuples $(x_1, \ldots, x_n)$ of real numbers for which

$$ \sum_{i=1}^{n} |x_i|^2 < \infty \tag{2.46} $$

is denoted $\mathbb{R}^n$ and is an $n$-dimensional Hilbert space. □

Example 2.4.16 ($\ell^2$) The space of infinite sequences $(x_1, x_2, \ldots)$ of real numbers for which

$$ \sum_{i=1}^{\infty} |x_i|^2 < \infty \tag{2.47} $$

is denoted $\ell^2$ and is an infinite-dimensional Hilbert space. □

Example 2.4.17 ($L^2(\mathcal{D})$) The space of real-valued Lebesgue-measurable square-integrable functions $f$ on a domain $\mathcal{D}$ such that

$$ \int_{\mathcal{D}} |f(x)|^2\, dx < \infty \tag{2.48} $$

is denoted $L^2(\mathcal{D})$ and generally is an infinite-dimensional Hilbert space. (It is actually a Hilbert space of equivalence classes of functions but we can treat it as a space of functions by identifying functions which are equal almost everywhere.) The inner product on $L^2$ is given by

$$ \langle f, g \rangle_{L^2} = \int_{\mathcal{D}} f(x)\, g(x)\, dx \tag{2.49} $$

□

Remark 2.4.18 In this thesis we often deal with the spaces $L^2[a, b]$, $L^2[0, \infty)$, and $L^2(-\infty, 0]$. □

We  are  especially  concerned  with  the  following  property  of  the  above  Hilbert 
spaces. 

Fact  2.4.19  All  of  the  above  Hilbert  spaces  are  separable,  i.e.  contain  a  countable 
orthonormal  basis.  □ 

Moreover,  we  have  the  following  result. 

Proposition 2.4.20 Any two separable infinite-dimensional Hilbert spaces are isomorphic. □

Remark 2.4.21 Actually, we can make the stronger statement that any two separable infinite-dimensional Hilbert spaces are linearly isometric. Hence, $\ell^2$ and $L^2$ are indistinguishable as Hilbert spaces. □

2.5  Stochastic  Processes 

This thesis relies heavily on the theory of continuous-parameter stochastic processes. In this section, we set up the mathematical framework. The material contained in this section is standard. It is drawn mainly from texts by Ash and Gardner [8], Astrom [10], Davis [36], Papoulis [126], and Wong [166], and class notes in Random Processes presented by Narayan [114] at the University of Maryland. Some basic elements of probability theory are needed, including probability spaces, measurable functions, and expectation. These subjects are covered in the aforementioned texts. We refer to the literature for all proofs.

In what follows, we use the notation $(\Omega, \mathcal{A}, \mathcal{P})$ to denote a probability space with sample space $\Omega$, associated $\sigma$-algebra $\mathcal{A}$, and probability measure $\mathcal{P}$. The terminology random variable refers to an $\mathcal{A}$-measurable $\mathbb{R}^n$-valued function $X$ defined on $\Omega$. Such a function is often called a random vector for the case where $n > 1$. However, we use the term random variable regardless of the value of the positive integer $n$. We will not deal with complex-valued random variables. We usually suppress the dependence of $X$ on $\omega \in \Omega$ and write $X$ as shorthand for $X(\omega)$. The expected value of a random variable $X$ is defined by

$$ E[X] = \int_{\Omega} X(\omega)\, \mathcal{P}(d\omega) \tag{2.50} $$

Definition 2.5.1 (Stochastic Process) A stochastic process $\{X_t,\ t \in T\}$ is a family of $\mathbb{R}^n$-valued random variables indexed by a real parameter $t$ and defined on a common probability space $(\Omega, \mathcal{A}, \mathcal{P})$. □

Remark 2.5.2 The parameter set $T$ is usually taken to be an interval $[a, b]$ where $a < b$. In the cases where $a = -\infty$, $b = \infty$, or both, the interval, respectively, is $(-\infty, b]$, $[a, \infty)$, or $(-\infty, \infty)$. The parameter $t$ represents time unless otherwise specified. □

Remark 2.5.3 The $\omega$-dependence is suppressed in the notation $\{X_t,\ t \in T\}$ which is shorthand for $\{X(\omega, t),\ \omega \in \Omega,\ t \in T\}$. □

Remark 2.5.4 Similarly, we can define a stochastic process with multiple parameters, e.g., a two-parameter stochastic process $\{X_{t,x},\ t \in T,\ x \in \mathcal{D}\}$ indexed by two real parameters $t$ and $x$ with respective index sets $T$ and $\mathcal{D}$. In this example, the parameters $t$ and $x$, respectively, typically represent time and a spatial variable. □

Definition 2.5.5 (Sample Path) For each $\omega \in \Omega$, $\{X_t(\omega),\ t \in T\}$ is an $\mathbb{R}^n$-valued function defined on $T$ and is called a sample function or sample path of the process. □

Remark 2.5.6 In addition, we also note that, by definition, for each $t \in T$, the function $X_t : \Omega \to \mathbb{R}^n$ is a random variable. □

Transition  Properties 

When  working  with  stochastically  excited  systems  we  will  encounter  processes  that 
possess  the  Markov  property. 

Definition 2.5.7 (Markov Process) A process $\{X_t,\ t \in T\}$ is said to be a Markov process if for any increasing collection $t_1, t_2, \ldots, t_n \in T$

$$ \mathcal{P}(X_{t_n} \le x_n \mid X_{t_\nu} = x_\nu,\ \nu = 1, \ldots, n-1) = \mathcal{P}(X_{t_n} \le x_n \mid X_{t_{n-1}} = x_{n-1}) \tag{2.51} $$

□

Definition 2.5.8 (Transition Density) Let $\{X_t,\ t \in T\}$ be a Markov process. The transition function of the process is defined by

$$ P(x, t; y, s) = \mathcal{P}(X_t \le x \mid X_s = y) \tag{2.52} $$

If there is a function $p(x, t; y, s)$ such that

$$ P(x, t; y, s) = \int_{-\infty}^{x} p(u, t; y, s)\, du \tag{2.53} $$

then we call $p(x, t; y, s)$ the transition density function. □

Remark 2.5.9 The transition density function $p(x, t; y, s)$ represents the probability density of being in state $x$ at time $t$ given that the process is in state $y$ at time $s$. □




Time  Independence  and  Averaging  Properties 


Time  independence  and  averaging  properties  are  of  importance  for  our  purposes. 
We  wish  to  be  able  to  take  time  averages  in  lieu  of  ensemble  averages,  since  explicit 
probability  measures  may  not  be  available.  This  requires  the  property  of  ergodicity, 
which  in  turn  requires  the  property  of  stationarity. 

Definition 2.5.10 (Stationary Process) A process $\{X_t,\ t \in \mathbb{R}\}$ is said to be stationary if for any $(t_1, \ldots, t_n)$ the joint distribution of $\{X_{t_1 + t_0}, X_{t_2 + t_0}, \ldots, X_{t_n + t_0}\}$ does not depend on $t_0$. □

A rigorous definition of what it means for a stationary process to be ergodic requires additional machinery (see, e.g., [166]) which we do not provide here. Instead, we state the following property of an ergodic process which is of primary interest for our purposes.

Proposition 2.5.11 Let $\{X_t,\ t \in \mathbb{R}\}$ be a separable and measurable ergodic process. Let $f$ be any Borel function such that $E[|f(X_0)|] < \infty$. Then

$$ E[f(X_0)] = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(X_t)\, dt \quad \text{almost surely} \tag{2.54} $$

Conversely, if (2.54) holds for every such $f$, then $\{X_t,\ t \in \mathbb{R}\}$ is ergodic. □

Remark 2.5.12 By stationarity, $E[f(X_t)] = E[f(X_0)]$ for all $t$. We interpret Proposition 2.5.11 as saying that, for an ergodic process, time average equals ensemble average. □
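The statement that time average equals ensemble average can be illustrated with a simple simulation. The sketch below is only an analogy and rests on assumptions that are not part of the text: it uses a discrete-time stationary Gaussian autoregression in place of a continuous-parameter process, a one-sided average, and hypothetical parameter values. The stationary law is $N(0, 1)$, so the ensemble averages of $x$ and $x^2$ are 0 and 1.

```python
import math
import random

def time_average(f, rho=0.9, n_steps=200_000, seed=0):
    # Discrete-time stand-in for a stationary ergodic process:
    #   X_{k+1} = rho * X_k + sqrt(1 - rho^2) * eps_k,  eps_k ~ N(0, 1),
    # started from the stationary law N(0, 1).
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)
    scale = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for _ in range(n_steps):
        total += f(x)
        x = rho * x + scale * rng.gauss(0.0, 1.0)
    return total / n_steps

# Time averages along a single sample path.
avg_first_moment = time_average(lambda x: x)       # ensemble value: 0
avg_second_moment = time_average(lambda x: x * x)  # ensemble value: 1
```

The agreement is statistical, so the time averages match the ensemble values only up to a sampling error that shrinks as the path length grows.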

Second-Order  Processes 

A class of stochastic processes of great importance is the class of second-order processes. In this thesis we make extensive use of the covariance function of a stochastic process. In order for a process to have a covariance function, it must be a second-order process.


Definition 2.5.13 (Second-Order Random Variable) A random variable $X$ is said to be a second-order random variable if $E[\| X \|^2] < \infty$ where $\| \cdot \|$ denotes the usual Euclidean norm. □

Definition 2.5.14 (Second-Order Stochastic Process) A process $\{X_t,\ t \in T\}$ is said to be a second-order stochastic process if for each fixed $t \in T$, $X_t$ is a second-order random variable. □

Thus, a second-order stochastic process is a parameterized family of second-order random variables. It has at least a first moment, second moment, and second central moment, called its mean, correlation, and covariance.

Definition 2.5.15 (Mean, Correlation, Covariance) Let $\{X_t,\ t \in T\}$ be a second-order process. The mean function $\mu : T \to \mathbb{R}^n$, correlation function $\mathcal{R} : T \times T \to \mathbb{R}^{n \times n}$, and covariance function $R : T \times T \to \mathbb{R}^{n \times n}$ are defined, respectively, as

$$ \mu(t) = E[X_t] \tag{2.55} $$

$$ \mathcal{R}(t, s) = E[X_t X_s^T] \tag{2.56} $$

$$ R(t, s) = E[(X_t - \mu(t))(X_s - \mu(s))^T] \tag{2.57} $$

□ 

Remark 2.5.16 The correlation and covariance functions are sometimes referred to, respectively, as the autocorrelation and autocovariance functions. □

Remark 2.5.17 For each $t \in T$, the mean function $\mu(t)$ is an $n$-vector. For each $t, s \in T$, the correlation and covariance functions, respectively, $\mathcal{R}(t, s)$ and $R(t, s)$, are $n \times n$ matrices. □

Remark 2.5.18 For a process with zero mean, $R(t, s) = \mathcal{R}(t, s)$ and the covariance and correlation functions can be used interchangeably. □

A covariance function satisfies a number of important properties. Two of importance for our purposes are as follows.

Proposition 2.5.19 (Symmetry of Covariance) Let $\{X_t,\ t \in T\}$ be a second-order process. Its covariance function is symmetric, i.e.,

$$ R(t, s) = R(s, t), \quad t, s \in T \tag{2.58} $$

□ 


Proposition 2.5.20 (Non-negativity of Covariance) Let $\{X_t,\ t \in T\}$ be a second-order process. Its covariance function is non-negative definite, i.e., for any finite collection $t_1, \ldots, t_n$ and real constants $a_1, \ldots, a_n$

$$ \sum_{i=1}^{n} \sum_{j=1}^{n} a_i\, a_j\, R(t_i, t_j) \ge 0 \tag{2.59} $$

□ 
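Condition (2.59) is easy to probe numerically for a given scalar covariance function. The sketch below assumes the covariance $R(t, s) = \exp(-|t - s|)$, which is a standard example of a valid covariance but is not taken from the text; the sampling ranges and function names are likewise hypothetical. It evaluates the quadratic form for many random choices of times and constants.

```python
import math
import random

def R(t, s):
    # Assumed scalar covariance function (exponential kernel).
    return math.exp(-abs(t - s))

def quadratic_form(times, a):
    # Left-hand side of (2.59).
    n = len(times)
    return sum(a[i] * a[j] * R(times[i], times[j])
               for i in range(n) for j in range(n))

rng = random.Random(1)
values = []
for _ in range(200):
    n = rng.randint(1, 6)
    times = [rng.uniform(0.0, 10.0) for _ in range(n)]
    a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    values.append(quadratic_form(times, a))

min_value = min(values)  # expected to be non-negative up to round-off
```

A random search of this kind can expose a function that fails (2.59), but it cannot prove non-negative definiteness.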

We are interested in working with covariance functions that are time independent. This can be ensured by assuming that the second-order process is stationary. However, stationarity is stronger than we actually need.

Definition 2.5.21 (Wide-Sense Stationary Process) A second-order process $\{X_t,\ t \in \mathbb{R}\}$ is said to be wide-sense stationary if its covariance function $R(t, s)$ is a function only of the difference $t - s$, i.e.,

$$ R(t, s) = R(t - s), \quad t, s \in T \tag{2.60} $$

□ 

Remark 2.5.22 A stationary second-order process is wide-sense stationary, but the converse is not necessarily true. Also, note that for a wide-sense stationary process, the mean function must be a constant, i.e., $\mu(t) = \mu$. □

One  object  of  importance  that  is  associated  with  a  wide-sense  stationary  second- 
order  process  is  its  spectral  density  function,  which  will  appear  in  the  description 
of  white  noise. 

Definition 2.5.23 (Spectral Density Function) The spectral density function for a wide-sense stationary second-order process $\{X_t,\ t \in T\}$ with covariance $R(\tau)$ is defined as

$$ S(\nu) = \int_{-\infty}^{\infty} \exp(-i 2 \pi \nu \tau)\, R(\tau)\, d\tau, \quad \nu \in \mathbb{R} \tag{2.61} $$

□ 


Remark 2.5.24 The inversion integral is

$$ R(\tau) = \int_{-\infty}^{\infty} \exp(i 2 \pi \nu \tau)\, S(\nu)\, d\nu, \quad \tau \in \mathbb{R} \tag{2.62} $$


□ 
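As a worked check of (2.61), consider the scalar covariance $R(\tau) = \exp(-|\tau|)$, an assumed example that does not appear in the text. Its spectral density has the closed form $S(\nu) = 2 / (1 + (2 \pi \nu)^2)$, which the sketch below compares against a truncated midpoint-rule evaluation of the integral. The truncation width, step count, and function names are illustrative choices.

```python
import cmath
import math

def R(tau):
    # Assumed covariance of a wide-sense stationary scalar process.
    return math.exp(-abs(tau))

def spectral_density(nu, half_width=30.0, steps=60_000):
    # Midpoint-rule approximation of (2.61) on [-half_width, half_width].
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps):
        tau = -half_width + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * nu * tau) * R(tau)
    return total * h

def closed_form(nu):
    # Known Fourier transform of exp(-|tau|) in this frequency convention.
    return 2.0 / (1.0 + (2.0 * math.pi * nu) ** 2)

checks = [(nu, spectral_density(nu), closed_form(nu)) for nu in (0.0, 0.25, 1.0)]
max_error = max(abs(numeric - exact) for _, numeric, exact in checks)
```

The imaginary part of the numerical result vanishes up to round-off because $R$ is even, so the spectral density is real in this case.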


We will be working with series expansions and stochastic differential equations. Thus, we need to take limits and derivatives. We use the following notions of limits and continuity when dealing with a second-order process.

Definition 2.5.25 (Quadratic Mean Convergence) A sequence of random variables $\{X^{(n)}\}$ is said to converge in quadratic mean to $X$ if

$$ \lim_{n \to \infty} E\left[ \| X^{(n)} - X \|^2 \right] = 0 \tag{2.63} $$

We call $X$ the limit in quadratic mean (q.m. limit) of $\{X^{(n)}\}$ and use the notation

$$ X = \underset{n \to \infty}{\text{lim in q.m.}}\ X^{(n)} \tag{2.64} $$

□

Definition 2.5.26 (Quadratic Mean Continuous Process) A second-order process $\{X_t,\ t \in T\}$ is said to be continuous in quadratic mean (q.m. continuous) at $t$ if

$$ \lim_{h \to 0} E\left[ \| X_{t+h} - X_t \|^2 \right] = 0 \tag{2.65} $$

A process that is q.m. continuous at every $t \in T$ is said to be a q.m. continuous process. □

An  important  fact  about  q.m.  continuous  processes  that  we  will  use  is 

Proposition 2.5.27 (Continuity of Covariance) If $\{X_t,\ t \in T\}$ is a second-order q.m. continuous process then its covariance function is continuous at every point on the square $T \times T$. □


Two-Parameter  Second-Order  Processes 

We will find it useful to redefine some of the above notions in the context of two-parameter stochastic processes. Consider a two-parameter second-order process $\{X_{t,x},\ t \in T,\ x \in \mathcal{D}\}$. The mean function, correlation function, and covariance function, respectively, are given by

$$ \mu(t, x) = E[X_{t,x}] \tag{2.66} $$

$$ \mathcal{R}(t, x, s, y) = E[X_{t,x} X_{s,y}^T] \tag{2.67} $$

$$ R(t, x, s, y) = E[(X_{t,x} - \mu(t, x))(X_{s,y} - \mu(s, y))^T] \tag{2.68} $$


It is often useful to fix one of the two parameters. When the parameter $t$ represents time and the parameter $x$ represents space, we use the following terminology. The functions

$$ \mathcal{R}(t, x, t, y) = E[X_{t,x} X_{t,y}^T] \tag{2.69} $$

$$ R(t, x, t, y) = E[(X_{t,x} - \mu(t, x))(X_{t,y} - \mu(t, y))^T] \tag{2.70} $$

$$ \mathcal{R}(t, x, s, x) = E[X_{t,x} X_{s,x}^T] \tag{2.71} $$

$$ R(t, x, s, x) = E[(X_{t,x} - \mu(t, x))(X_{s,x} - \mu(s, x))^T] \tag{2.72} $$

are called, respectively, the spatial correlation, spatial covariance, temporal correlation, and temporal covariance. Sometimes, the term two-point precedes the object name, e.g., two-point spatial covariance. The symbols $\mathcal{R}$ and $R$ are used to denote, respectively, the correlation and covariance functions, in all cases.

If a two-parameter second-order process is wide-sense stationary with respect to time, then the spatial correlation and spatial covariance, respectively, can be written in terms of the spatial parameters only, i.e.,

$$ \mathcal{R}(t, x, t, y) = \mathcal{R}(x, y) $$

$$ R(t, x, t, y) = R(x, y) $$

If a two-parameter second-order process is wide-sense stationary with respect to the spatial variable, then the temporal correlation and temporal covariance, respectively, can be written in terms of the temporal parameters only, i.e.,

$$ \mathcal{R}(t, x, s, x) = \mathcal{R}(t, s) $$

$$ R(t, x, s, x) = R(t, s) $$

Finally, the notion of q.m. continuity must be considered with respect to one particular parameter, i.e., a two-parameter second-order process $\{X_{t,x},\ t \in T,\ x \in \mathcal{D}\}$ is q.m. continuous with respect to $t \in T$ if for each $x \in \mathcal{D}$

$$ \lim_{h \to 0} E\left[ \| X_{t+h,x} - X_{t,x} \|^2 \right] = 0 \tag{2.73} $$

and similarly for q.m. continuity with respect to $x \in \mathcal{D}$.

Hilbert  Space  Properties 

It  will  be  necessary  to  collect  random  variables  as  members  of  a  Hilbert  space. 

Definition 2.5.28 (Linear Operation on a Second-Order Process) Let $\{X_t,\ t \in T\}$ be a second-order process. A random variable $Y$ is said to be derived from a linear operation on $\{X_t,\ t \in T\}$ if either of the following is true:

(i) For some integer $N$ and times $\{t_1, \ldots, t_N\}$

$$ Y = \sum_{i=1}^{N} \alpha_i X_{t_i} \tag{2.74} $$

(ii) $Y$ is the q.m. limit of a sequence of such finite linear combinations.

□ 

Definition 2.5.29 ($\mathcal{H}_X$) The collection of all random variables derived from linear operations on a process $\{X_t,\ t \in T\}$ is denoted $\mathcal{H}_X$. □

Remark 2.5.30 The set $\mathcal{H}_X$ is generally an infinite-dimensional Hilbert space. It is separable, and so is linearly isometric with $\ell^2$ (see Section 2.4). The inner product on $\mathcal{H}_X$ is given by

$$ \langle Y, Z \rangle_{\mathcal{H}_X} = E[Y Z^T] \tag{2.75} $$

□

Gaussian Processes


A  special  case  of  a  second-order  stochastic  process  of  great  importance  for  our 
purposes  is  a  Gaussian  process.  We  first  need  to  define  what  we  mean  by  a  Gaussian 
random  variable. 

Definition 2.5.31 (Gaussian Random Variable) A second-order random variable $Z$ with $\mu = E[Z]$ and $\sigma^2 = E[(Z - \mu)^2]$ is said to be Gaussian if $\sigma^2 = 0$, in which case $Z = \mu$ with probability 1, or

$$ \Pr(Z \le a) = \int_{-\infty}^{a} \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left( -\frac{1}{2} \frac{(z - \mu)^2}{\sigma^2} \right) dz \tag{2.76} $$

□


Remark  2.5.32  A  random  variable  that  is  Gaussian  is  also  referred  to  as  normal. 

□ 


Remark 2.5.33 A Gaussian random $n$-vector has a density function determined only by parameters $\mu$ and $R$, given by

$$ p_Z(z) = \frac{1}{(2 \pi)^{n/2} |R|^{1/2}} \exp\left( -\frac{1}{2} (z - \mu)^T R^{-1} (z - \mu) \right) \tag{2.77} $$

□ 


Definition 2.5.34 (Gaussian Process) A second-order stochastic process $\{X_t,\ t \in T\}$ is said to be a Gaussian process if, for every integer $N$ and all times $\{t_1, \ldots, t_N\}$, every finite linear combination of the form

$$ Z = \sum_{i=1}^{N} \alpha_i X_{t_i} \tag{2.78} $$

is a Gaussian random variable. □

The Wiener Process (Brownian Motion)

One important special case of a Gaussian process is the Wiener process, which is the mathematical abstraction of the physical phenomenon known as Brownian motion. We use the terms Wiener process and Brownian motion process interchangeably. The Wiener process is a zero mean process with certain properties, defined as follows.


Definition 2.5.35 (Orthogonal Increments) A process $\{X_t,\ t \in T\}$ is said to have orthogonal increments if for any non-overlapping intervals $(s, t)$ and $(s', t')$

$$ E\left[ (X_{t'} - X_{s'})(X_t - X_s)^T \right] = 0 \tag{2.79} $$


□ 


Remark 2.5.36 A process that has orthogonal increments is said to be an orthogonal increments process. □

Definition 2.5.37 (Independent Increments) A process $\{X_t,\ t \in T\}$ is said to have independent increments if for any two non-overlapping intervals $(s, t)$ and $(s', t')$, the random variables $(X_{t'} - X_{s'})$ and $(X_t - X_s)$ are independent. □

Remark 2.5.38 A process that has independent increments is said to be an independent increments process. □

Remark 2.5.39 Clearly, the class of independent increments processes is a subclass of the class of orthogonal increments processes. □

Definition 2.5.40 (Stationary Increments) A process $\{X_t,\ t \in T\}$ has stationary increments if the variance of the increment $(X_t - X_s)$ depends only on the distance $|t - s|$, i.e.,

$$ E[(X_t - X_s)^2] = E[(X_{t+r} - X_{s+r})^2], \quad r, s, t \in T \tag{2.80} $$

□

Remark 2.5.41 A process that has stationary increments is said to be a stationary increments process. □

Proposition 2.5.42 A q.m. continuous process $\{X_t,\ t \in T\}$ has stationary orthogonal increments if and only if its covariance function is

$$ R(t, s) = \sigma^2 \min(t, s) \tag{2.81} $$

□ 

Now  we  define  the  Wiener  process.  It  is  defined  for  positive  time,  usually  on 
the  interval  [0,  oo). 

Definition 2.5.43 (Wiener Process) A process $\{W_t,\ t \in \mathbb{R}^+\}$ is said to be a Wiener process or a Brownian motion process if it has zero mean, i.e., $E[W_t] = 0$ for $t \ge 0$, and it has stationary independent Gaussian increments. □

Remark 2.5.44 By a standard Wiener process we mean that $W_0 = 0$ and $E[W_1^2] = 1$. Thus, a standard Wiener process is a Gaussian process with mean $\mu(t) = 0$ and covariance function $R(t, s) = \min(t, s)$. □
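The covariance $R(t, s) = \min(t, s)$ of a standard Wiener process can be checked by simulation. The following sketch is illustrative only: it builds sample paths on a uniform grid from independent Gaussian increments with variance equal to the step size, and estimates $E[W_s W_t]$ and the increment variance at the hypothetical times $s = 0.3$ and $t = 0.7$. The grid, the sample size, and all names are assumptions.

```python
import math
import random

def wiener_path(n_steps, dt, rng):
    # Sample path on the grid 0, dt, 2*dt, ... built from independent
    # N(0, dt) increments, starting from W_0 = 0.
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return w

rng = random.Random(42)
dt, n_steps, n_paths = 0.1, 10, 20_000
i_s, i_t = 3, 7  # grid indices of s = 0.3 and t = 0.7

sum_prod = 0.0
sum_inc_sq = 0.0
for _ in range(n_paths):
    w = wiener_path(n_steps, dt, rng)
    sum_prod += w[i_s] * w[i_t]
    sum_inc_sq += (w[i_t] - w[i_s]) ** 2

cov_est = sum_prod / n_paths        # estimate of R(s, t); min(s, t) = 0.3
inc_var_est = sum_inc_sq / n_paths  # estimate of E[(W_t - W_s)^2]; t - s = 0.4
```

Both estimates carry Monte Carlo error of order $1 / \sqrt{\text{n\_paths}}$, so they should be compared with the exact values only within a tolerance.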

Remark 2.5.45 Some important properties of a Wiener process are

(i) Any sample path $W(t)$ is continuous everywhere with probability 1, differentiable nowhere, and of infinite length.

(ii) The increment $(W_{t+h} - W_t)$ is of order $O(\sqrt{h})$, i.e., $dW_t$ is proportional to $\sqrt{dt}$.

□

Even though the sample path of a Wiener process is nowhere differentiable, it is often written, formally, that there is a process $\{\zeta_t,\ t \in \mathbb{R}^+\}$ such that

$$ \zeta_t = \frac{d}{dt} W_t \tag{2.82} $$

and the converse

$$ W_t = \int_0^t \zeta_s\, ds \tag{2.83} $$

are true in some useful sense. Such a process $\{\zeta_t,\ t \in \mathbb{R}^+\}$ is called white noise. A process with the above relationship to Brownian motion, i.e., its formal derivative, is extremely useful in applications, even though it does not exist in the traditional sense. We elaborate and justify its relationship with a Wiener process as follows.

White  Noise 

A white noise is usually described (and sometimes defined) as a wide-sense stationary process with a spectral density function that is constant over all frequencies, i.e.,

$$ S(\nu) = S_0, \quad \nu \in \mathbb{R} \tag{2.84} $$

This description has the following implications:

(i) $R(0) = \infty$

(ii) $R(\tau) = \delta(\tau)\, S_0$

Thus, a process with the above description is not a second-order process and does not have a well-defined spectral density. In fact, a white noise is not well-defined as a stochastic process. Rather, it is a generalized process. We will not proceed with a digression on generalized processes here. See Arnold [7] for an exposition on this subject. For completeness, we include the definition of a white noise process.

Definition 2.5.46 (White Noise) A generalized Gaussian stochastic process $\Phi_\zeta$ is said to be a Gaussian white noise process if it has mean functional $E[\Phi_\zeta(\phi)] = 0$ and covariance functional

$$ E[\Phi_\zeta(\phi)\, \Phi_\zeta(\psi)] = \int_{-\infty}^{\infty} \phi(t)\, \psi(t)\, dt \tag{2.85} $$

□

Remark 2.5.47 When the Wiener process is considered as a generalized process, the covariance function of its derivative is given by

$$ R(t, s) = \delta(t - s)\, S_0 \tag{2.86} $$

which is the covariance function of a white noise process. Thus, white noise $\{\zeta_t,\ t \in \mathbb{R}^+\}$ is the derivative of the Wiener process $\{W_t,\ t \in \mathbb{R}^+\}$ when both processes are considered as generalized processes. This justifies the relationships (2.82) and (2.83). □


It suffices for our purposes to use the formal description of a white noise process, justified by noting that $\{\zeta_t,\ t \in \mathbb{R}^+\}$ is never used outside of an integral. In particular, an expression of the form

$$ \int_a^b \zeta_t\, \phi(t)\, dt \tag{2.87} $$

is said to be a white noise integral. Expression (2.87) is merely formal; there is no stochastic process $\{\zeta_t,\ t \in \mathbb{R}^+\}$ for which such an integral exists. Rather, it is to be interpreted as a stochastic integral,

$$ \int_a^b \phi(t)\, dW_t \tag{2.88} $$

which is defined in Section 2.6.

2.6 Stochastically Excited Dynamical Systems


A stochastically excited dynamical system is a control system with white noise injected at the input terminals. These systems play a crucial role in the model reduction approach presented in this thesis. The time evolution of the state of such a system is not governed by differential equations using the ordinary Stieltjes calculus. Instead, it is necessary to work with the stochastic calculus, including stochastic differential equations (SDEs). Moreover, the state is a stochastic process, with an associated probability density function, the evolution of which is governed by a pair of diffusion equations.

The  material  contained  in  this  section  is  based  on  that  presented  in  texts  by 
Arnold  [7],  Astrom  [10],  Davis  [36],  and  Wong  [166],  papers  by  Brockett  [22]  and 
Fuller  [51],  and  class  notes  in  Stochastic  Control  presented  by  Marcus  [101]  at  the 
University  of  Maryland.  It  relies  heavily  on  the  material  presented  in  Section  2.5. 
We  refer  to  the  literature  for  all  proofs. 

2.6.1  State  Equations 

Recall the form of the state equation for an affine control system

$$ \dot{x}(t) = f(t, x(t)) + \sum_{i=1}^{m} g_i(t, x(t))\, u_i(t) \tag{2.89} $$

where for purposes of generality we include the possibility of explicit time dependence. By a stochastically excited dynamical system, we mean an affine control system for which the $m$ components of the input, $u_i$, $i \in \underline{m}$, have been replaced by the sample paths of $m$ Gaussian white noises, $\{(\zeta_t)_i,\ t \in \mathbb{R}^+\}$, $i \in \underline{m}$.

The evolution equation for the state process $\{X_t,\ t \in \mathbb{R}^+\}$ takes the form

$$ \frac{d}{dt} X_t = f(t, X_t) + \sum_{i=1}^{m} g_i(t, X_t)\, (\zeta_t)_i \tag{2.90} $$

The meaning of (2.90) is given in terms of the stochastic integral. In particular, given a function $\phi : \Omega \times \mathbb{R} \to \mathbb{R}^{n \times m}$, a stochastic integral is a quantity of the form

$$ I(\phi) = \int_a^b \phi(\omega, t)\, dW(\omega, t) \tag{2.91} $$

Because $\{W_t,\ t \in \mathbb{R}^+\}$ is neither differentiable nor of bounded variation, (2.91) does not have a well-defined interpretation as an integral in the ordinary sense. Therefore, it is necessary to define what we mean by (2.91).

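We use the following norm for functions $\phi : \Omega \times \mathbb{R} \to \mathbb{R}^{n \times m}$:

$$\|\phi\|^2 = \int_a^b E\left[\, |\phi(\omega, t)|^2 \,\right] dt \tag{2.92}$$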


The  stochastic  integral  is  defined  in  terms  of  a  step  function  or  a  sequence  of 
step  functions. 

Definition 2.6.1 ($(\omega,t)$-Step Function) Let $\phi$ be jointly measurable in $(\omega,t)$
and such that $\|\phi\| < \infty$. If there exist times $t_0, \dots, t_n$, independent of $\omega$,
such that $a \le t_0 < \cdots < t_n \le b$ and

$$\phi(\omega,t) = \phi_\nu(\omega), \qquad t_\nu \le t < t_{\nu+1}, \quad \nu = 1, \dots, n-1 \tag{2.93}$$

then $\phi$ is said to be an $(\omega,t)$-step function. □


Definition 2.6.2 (Stochastic Integral) Let $\phi$ be jointly measurable in $(\omega,t)$ and
such that $\|\phi\| < \infty$. The quantity

$$I(\phi) = \int_a^b \phi(\omega,t)\, dW(\omega,t) \tag{2.94}$$

is said to be a stochastic integral, defined as follows:

(i) If $\phi$ is an $(\omega,t)$-step function then

$$\int_a^b \phi(\omega,t)\, dW(\omega,t) = \sum_{\nu=1}^{n-1} \phi_\nu(\omega) \left[ W(\omega, t_{\nu+1}) - W(\omega, t_\nu) \right] \tag{2.95}$$

(ii) Otherwise,

$$\int_a^b \phi(\omega,t)\, dW(\omega,t) = \lim_{n \to \infty} \text{in q.m.} \int_a^b \phi_n(\omega,t)\, dW(\omega,t) \tag{2.96}$$

where $\{\phi_n\}$ is a sequence of $(\omega,t)$-step functions satisfying

$$\lim_{n \to \infty} \|\phi_n - \phi\|^2 = 0 \tag{2.97}$$

□


Remark 2.6.3 The existence of the convergent sequence in (2.97) is guaranteed
(see, e.g., [166] Chap. 4, Prop. 2.1). □

Remark 2.6.4 In what follows, and throughout this thesis, we take the limits of
integration as $a = 0$ and $b = \infty$ unless specified otherwise. In this case, the
stochastic integral is defined via the limit in q.m. as $b \to \infty$. □
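The step-function sum in (2.95) can be illustrated numerically. The following is a minimal sketch, not taken from the thesis: it assumes a uniform partition of [0, 1], simulates a Wiener path from independent Gaussian increments, and takes the step function to be the path itself frozen at the left endpoint of each subinterval. The names `wiener_path` and `ito_step_sum` are hypothetical. For this choice the sum satisfies an exact algebraic identity, which gives a convenient check.

```python
import math
import random

def wiener_path(n_steps, t_end, rng):
    """Sample W at t_k = k * t_end / n_steps, with W(0) = 0."""
    dt = t_end / n_steps
    w = [0.0]
    for _ in range(n_steps):
        # Independent Gaussian increments with variance dt.
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

def ito_step_sum(phi_values, w):
    """Sum of phi_v * (W(t_{v+1}) - W(t_v)), phi_v held fixed on [t_v, t_{v+1})."""
    return sum(phi_values[k] * (w[k + 1] - w[k]) for k in range(len(w) - 1))

rng = random.Random(0)          # fixed seed so the sketch is reproducible
n = 2000
w = wiener_path(n, 1.0, rng)

# Step-function approximation of the stochastic integral of W against dW on [0, 1].
integral = ito_step_sum(w[:-1], w)

# Sum of squared increments; it concentrates near the interval length as n grows.
quad_var = sum((w[k + 1] - w[k]) ** 2 for k in range(n))
```

With the left-endpoint choice, `integral` equals `0.5 * (w[-1] ** 2 - quad_var)` up to rounding, which is the discrete counterpart of the familiar extra term that distinguishes the stochastic integral from an ordinary one.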

The  mathematical  model  for  a  stochastically  excited  dynamical  system  uses 
the  notion  of  a  stochastic  differential  equation  and  a  white  noise  driven  differential 
equation,  both  of  which  are  interpreted  precisely  via  the  stochastic  integral. 

Definition 2.6.5 (Stochastic Differential Equation) Given functions
$f : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}^n$ and $g_i : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}^n$, $i \in \underline{m}$, a stochastic differential
equation (SDE) is an equation of the form

$$dX_t = f(t, X_t)\, dt + \sum_{i=1}^{m} g_i(t, X_t)\, (dW_t)_i \tag{2.98}$$

□

Definition 2.6.6 (Solution of SDE) A process $\{X_t,\ t \in \mathbb{R}^+\}$ is said to satisfy
the SDE (2.98) with initial condition $X_0 = X$ if

(i) The quantities

$$\int_0^t g_i(s, X_s)\, (dW_s)_i, \qquad i \in \underline{m}$$

are capable of being interpreted as stochastic integrals.

(ii) For each $t$, $X_t$ is almost surely equal to the random variable defined by

$$X + \int_0^t f(s, X_s)\, ds + \int_0^t \sum_{i=1}^{m} g_i(s, X_s)\, (dW_s)_i$$

where the first integral is of ordinary type and the second is a stochastic
integral.

□

Remark 2.6.7 Thus, the SDE (2.98) is an expression that means

$$X_t = X_0 + \int_0^t f(s, X_s)\, ds + \int_0^t \sum_{i=1}^{m} g_i(s, X_s)\, (dW_s)_i \tag{2.99}$$

where the first integral is of ordinary type and the second is a stochastic integral.

□


The existence and uniqueness of a solution $\{X_t,\ t \in [0,b]\}$ of SDE (2.98) is
guaranteed under certain regularity conditions on $f$ and $g$. Furthermore, under
those conditions, the unique solution is a Markov process.

Proposition 2.6.8 (Properties of Solutions) Let $\{W_t,\ t \in [0,b]\}$ be a Wiener
process and $X$ be a second-order random variable. Let $f(t,x)$ and $g(t,x)$, $t \in [0,b]$,
$x \in \mathbb{R}$, be measurable in $(t,x)$. Suppose that $f$ and $g$ satisfy the following
conditions:

$$|f(t,x) - f(t,y)| + |g(t,x) - g(t,y)| \le K\, |x - y| \tag{2.100}$$

$$|f(t,x)| + |g(t,x)| \le K \sqrt{1 + x^2} \tag{2.101}$$

Then there exists a process $\{X_t,\ t \in [0,b]\}$ such that

(i) $\{X_t,\ t \in [0,b]\}$ satisfies the SDE (2.98) with initial condition $X_0 = X$.

(ii) $\{X_t,\ t \in [0,b]\}$ is unique with probability 1.

(iii) $\{X_t,\ t \in [0,b]\}$ is a Markov process.

(iv) $\{X_t,\ t \in [0,b]\}$ has continuous sample paths with probability 1.

□

Remark 2.6.9 The condition (2.100) is called the uniform Lipschitz condition
and the condition (2.101) is called the restriction on growth condition. The
constants $K$ can be the same. If the restriction on growth condition is violated,
the solution may exhibit an “explosion”, i.e., a finite escape time. □

The  connection  between  a  stochastically  excited  system  and  SDEs  is  made 
using  the  notion  of  a  white  noise  driven  differential  equation. 

Definition 2.6.10 (White Noise Driven Differential Equation) Given
functions $f : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}^n$ and $g_i : \mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}^n$, $i \in \underline{m}$, a white noise driven
differential equation is an equation of the form (2.90) (repeated below)

$$\frac{d}{dt} X_t = f(t, X_t) + \sum_{i=1}^{m} g_i(t, X_t)\,(\zeta_t)_i$$

where for each $i$, $\{(\zeta_t)_i,\ t \in \mathbb{R}^+\}$ is a Gaussian white noise. □

The interpretation of (2.90) is that of a sequence of SDEs

$$\frac{d}{dt} X_t^{(n)} = f(t, X_t^{(n)}) + \sum_{i=1}^{m} g_i(t, X_t^{(n)})\, (\zeta_t^{(n)})_i \tag{2.102}$$

where $\{\zeta_t^{(n)},\ t \in \mathbb{R}^+\}$ represents a sequence of Gaussian processes that converges
in some suitable sense to a white noise, yet for each $n$, $\{\zeta_t^{(n)},\ t \in \mathbb{R}^+\}$ has
well-behaved sample paths. If the sequence of processes $\{X_t^{(n)},\ t \in \mathbb{R}^+\}$ converges (say,
in q.m.) to a process $\{X_t,\ t \in \mathbb{R}^+\}$ then we interpret $X_t$ as the solution of (2.90).
It was shown by Wong and Zakai [164, 165] that, given the above interpretation,
the precise mathematical meaning of (2.90) is that of an SDE (given elementwise)

$$(dX_t)_i = \tilde{f}_i(t, X_t)\, dt + \sum_{j=1}^{m} g_{ij}(t, X_t)\, (dW_t)_j, \qquad i \in \underline{n} \tag{2.103}$$

where

$$\tilde{f}_i(t, X_t) = f_i(t, X_t) + \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{m} \frac{\partial g_{ik}}{\partial x_j}(t, X_t)\, g_{jk}(t, X_t), \qquad i \in \underline{n} \tag{2.104}$$

and $g_{ij} = (g_j)_i$, $i \in \underline{n}$, $j \in \underline{m}$.

Definition 2.6.11 (Correction Term) The second term on the right side of
(2.104) is called the correction term (sometimes referred to as the Itô-Stratonovich
correction term). Its appearance is due to the fact that $dW_t$ is proportional to $\sqrt{dt}$
(see Remark 2.5.45). □

Remark 2.6.12 To summarize, a stochastically excited system is an affine control
system with Gaussian white noise injected at the input terminals. It is modeled by
a white noise driven differential equation (2.90), which is interpreted as an SDE of
the form (2.103). The SDE (2.103) is defined in terms of the stochastic integral,
i.e., by an integral equation of the form (2.99), but with $f$ replaced by $\tilde{f}$, i.e., the
sum of $f$ and the correction term. □

Remark 2.6.13 When the functions $g_i$, $i \in \underline{m}$, are independent of $x$, i.e.,
$g_i(\cdot, x) = g_i(\cdot)$, then the correction term vanishes. □
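As an illustration of the correction term, the sketch below evaluates the double sum $\frac{1}{2} \sum_j \sum_k (\partial g_{ik} / \partial x_j)\, g_{jk}$ numerically, using central differences for the partial derivatives. It is only a sketch under stated assumptions: the helper name `correction_term`, the time-independent matrix-valued `g` (returned as a list of rows), and the two scalar test cases are all hypothetical choices made for the example.

```python
def correction_term(g, x, h=1e-6):
    """0.5 * sum_j sum_k (d g_ik / d x_j) * g_jk, by central differences.

    g(x) returns an n-by-m matrix as a list of rows; x is a list of length n.
    """
    n = len(x)
    gx = g(x)
    m = len(gx[0])
    corr = [0.0] * n
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        gp, gm = g(xp), g(xm)
        for i in range(n):
            for k in range(m):
                dgik_dxj = (gp[i][k] - gm[i][k]) / (2.0 * h)
                corr[i] += 0.5 * dgik_dxj * gx[j][k]
    return corr

b = 0.7

def multiplicative(x):
    return [[b * x[0]]]      # g(x) = b x: noise intensity depends on the state

def additive(x):
    return [[b]]             # g independent of x

c_mult = correction_term(multiplicative, [2.0])   # analytically 0.5 * b**2 * x
c_add = correction_term(additive, [2.0])          # analytically zero
```

In the state-dependent case the correction is $\frac{1}{2} b^2 x$, while in the state-independent case it vanishes, in agreement with the remark above.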

Remark 2.6.14 (Simulation) Care must be taken to implement a numerical
simulation of SDE (2.103) correctly. In particular, the continuous-time
Wiener process must be approximated by a sequence of Gaussian random variables.
The statistics of these random variables must be chosen in a manner consistent with
the approximation scheme. Details are provided in Appendix C. □
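To make the point about increment statistics concrete, here is a minimal sketch of one common discretization, the Euler-Maruyama scheme, for a scalar SDE. This is not necessarily the scheme of Appendix C; it is a standard scheme substituted for illustration. The function name `euler_maruyama`, the linear drift, and the constant noise gain are assumptions. The drift passed in is taken to already include any correction term, and each Wiener increment is drawn with variance equal to the step size.

```python
import math
import random

def euler_maruyama(drift, g, x0, t_end, n_steps, rng):
    """Integrate the scalar SDE dX = drift(t, X) dt + g(t, X) dW from 0 to t_end."""
    dt = t_end / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        # Wiener increment over one step: zero mean, variance dt (not dt**2).
        dw = rng.gauss(0.0, math.sqrt(dt))
        x += drift(t, x) * dt + g(t, x) * dw
        t += dt
    return x

rng = random.Random(1)
a, sigma = 1.0, 0.5

# Ensemble of terminal values for dX = -a X dt + sigma dW, started at X_0 = 1.
samples = [
    euler_maruyama(lambda t, x: -a * x, lambda t, x: sigma, 1.0, 5.0, 200, rng)
    for _ in range(2000)
]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

For this linear example the ensemble variance at large times should be close to $\sigma^2 / (2a)$; scaling the increments by the step size instead of its square root would give a variance that shrinks with the step size, which is the kind of inconsistency the remark warns against.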
2.6.2 Diffusion Equations


We  have  presented  the  mathematical  framework  regarding  the  time  evolution  of 
the  state  for  a  stochastically  excited  system.  The  state,  which  is  represented  by 
a  Markov  process,  has  an  associated  transition  probability,  which  also  evolves  in 
time.  The  transition  probability  of  a  process  satisfying  an  SDE  can  be  obtained  by 
solving  either  of  a  pair  of  parabolic  PDEs.  These  equations  are  called  the  backward 
and  forward  equations  of  Kolmogorov,  or  diffusion  equations. 

Let the process $\{X_t,\ t \in T\}$ be the unique solution of the SDE (2.98) and
have transition function $P(x,t;y,s)$ and transition density function $p(x,t;y,s)$
(see Definition 2.5.8). The forward time evolution of $p(x,t;y,s)$ is governed by the
forward equation of Kolmogorov, also known as the Fokker-Planck equation, given
by

$$\frac{\partial p}{\partial t}(x,t;y,s) = \mathcal{L} p = -\sum_{i=1}^{n} \frac{\partial}{\partial x_i} \big( f_i(t,x)\, p(x,t;y,s) \big) + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2}{\partial x_i\, \partial x_j} \big( b_{ij}(t,x)\, p(x,t;y,s) \big) \tag{2.105}$$

with initial condition

$$p(x,s;y,s) = \delta(x - y)$$

and where

$$b_{ij}(t,x) = \sum_{k=1}^{m} g_{ik}(t,x)\, g_{jk}(t,x) = \left[ g(t,x)\, g^T(t,x) \right]_{ij}$$


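The reverse time evolution of $P(x,t;y,s)$ is governed by the backward equation
of Kolmogorov, given by

$$-\frac{\partial P}{\partial s}(x,t;y,s) = \mathcal{L}^* P = \sum_{i=1}^{n} f_i(s,y)\, \frac{\partial}{\partial y_i} \big( P(x,t;y,s) \big) + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij}(s,y)\, \frac{\partial^2}{\partial y_i\, \partial y_j} \big( P(x,t;y,s) \big) \tag{2.106}$$

with terminal condition

$$P(x,t;y,t) = \begin{cases} 1 & x > y \\ 0 & x < y \end{cases}$$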

Detailed  derivations  of  Equations  (2.105)  and  (2.106)  are  presented  in  [7,  166].  For 
a  history  and  derivation  of  the  Fokker-Planck  equation  from  a  physical  point  of 
view  see  [51]. 

Remark 2.6.15 The operators $\mathcal{L}$ and $\mathcal{L}^*$ are linear and are adjoints of one another. □

Remark 2.6.16 When working with a white noise driven equation, it is important
to keep in mind that the functions $f_i$, $i \in \underline{n}$, must incorporate the correction term.
□


Existence  and  uniqueness  of  solutions  to  Equations  (2.105)  and  (2.106)  can 
be  shown  under  suitable  regularity  conditions  (see,  e.g.,  [7,  166]).  However,  the 
following  result  by  Elliott  [42,  43]  (and  elaborated  upon  by  Brockett  [21,  22])  is 
more  useful  for  our  purposes  (because  it  appeals  to  our  control  theoretic  viewpoint). 

Theorem 2.6.17 (Elliott [42, 43]) Suppose that

(i) the Lie algebra of vector fields generated by $\{f, g_1, \dots, g_m\}$ consists of
complete vector fields on a manifold $M$; and

(ii) the smallest Lie algebra which contains $g_1, \dots, g_m$ and which is closed under
bracketing with $f$ spans the tangent space of $M$ at each point.

Then the corresponding SDE (2.103) defines smooth transition densities on $M$. □


Remark 2.6.18 By completeness of a vector field $f$, we mean that the solution
of $\dot{x} = f(x)$, $x(0) = x_0$ is defined for all time, i.e., there are no finite escape
times in forward or backward time. □

Remark 2.6.19 Theorem 2.6.17 states that strong local accessibility of the
corresponding affine control system, together with a completeness condition on vector
fields, guarantees the existence of smooth transition densities, i.e., smooth solutions
of the Fokker-Planck equation, for all times $t$. We apply this result in Section 4.3.2.
□


In many applications, the functions $f_i$, $i \in \underline{n}$, and $g_{ij}$, $i \in \underline{n}$, $j \in \underline{m}$ (and
hence the $b_{ij}$) are time-independent. In such cases we are often interested in the
steady-state probability density (if any) which $p$ approaches as $t$ becomes large.
In the steady state, $\partial p / \partial t$ vanishes, i.e., the probability density is stationary, and
Equation (2.105) simplifies to the stationary Fokker-Planck equation

$$0 = -\sum_{i=1}^{n} \frac{\partial}{\partial x_i} \big( f_i(x)\, p_\infty(x) \big) + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2}{\partial x_i\, \partial x_j} \big( b_{ij}(x)\, p_\infty(x) \big) \tag{2.107}$$

where $p_\infty(x)$ denotes the stationary probability density (if it exists).


Remark  2.6.20  The  dependence  of  the  transition  density  on  y  and  s  vanishes  in 
the  steady-state  case.  □ 


A solution $p_\infty(x)$, if it is to represent a probability density function, must also
satisfy

$$p_\infty(x) \ge 0, \qquad x \in \mathbb{R}^n \tag{2.108}$$

and

$$\int_{\mathbb{R}^n} p_\infty(x)\, dx = 1 \tag{2.109}$$

Definition 2.6.21 (Stationary Solution) A solution of (2.107) which also
satisfies (2.108) and (2.109) is called a stationary solution of the Fokker-Planck
equation. □

Boundary conditions for (2.107) are assigned as

$$\lim_{x \to \infty} p_\infty(x) = 0 \tag{2.110}$$

and as a consequence

$$\lim_{x \to \infty} \frac{\partial p_\infty}{\partial x}(x) = 0 \tag{2.111}$$

We  can  use  Theorem  2.6.17  in  order  to  establish  the  existence  of  a  smooth 
invariant  density.  In  addition,  Zakai  [169]  shows  how  to  prove  existence  via  a 
Lyapunov  criterion.  Moreover,  Fuller  [51]  argues  that  we  can  heuristically  assume 
the  existence  of  a  unique  stationary  asymptotically  stable  transition  density  if 

(i)  No  part  of  the  system  is  completely  isolated  from  the  effects  of  the  white 
noise. 

(ii)  The  system  has  restoring  forces  which  prevent  the  ensemble  from  dispersing 
to  infinity. 

We  shall  make  the  standing  assumption  that  the  stochastically  excited  systems 
we  work  with  yield  a  unique  stationary  asymptotically  stable  transition  density, 
unless noted otherwise. We call this transition density the steady-state density,
stationary density, or invariant density.

Closed  form  solutions  of  (2.107)  exist  in  certain  special  cases  such  as  for  systems 
with  a  conservative  (Hamiltonian)  part  and  a  dissipative  part  [51,  171].  These 
solutions  will  be  exploited  in  Chapter  4. 
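A simple closed-form case can be checked numerically. The sketch below is an illustrative example, not one of the systems treated in Chapter 4: it assumes the scalar linear system with drift $f(x) = -a x$ and constant diffusion coefficient $b = \sigma^2$, for which a zero-mean Gaussian density with variance $\sigma^2 / (2a)$ is a candidate stationary solution. The code evaluates the right side of the stationary Fokker-Planck equation by central differences and also checks the normalization condition. The names `p_inf` and `fp_residual` and the parameter values are hypothetical.

```python
import math

a, sigma = 1.5, 0.8
var = sigma ** 2 / (2.0 * a)

def p_inf(x):
    """Candidate stationary density: zero-mean Gaussian with variance sigma^2 / (2a)."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fp_residual(x, h=1e-3):
    """Right side of the stationary Fokker-Planck equation at x (scalar case)."""
    def drift_flux(y):
        return (-a * y) * p_inf(y)          # f(x) p(x)

    def diffusion(y):
        return sigma ** 2 * p_inf(y)        # b(x) p(x)

    d_flux = (drift_flux(x + h) - drift_flux(x - h)) / (2.0 * h)
    d2_diff = (diffusion(x + h) - 2.0 * diffusion(x) + diffusion(x - h)) / (h * h)
    return -d_flux + 0.5 * d2_diff

# Residual on a grid covering several standard deviations.
residuals = [fp_residual(0.1 * k) for k in range(-30, 31)]
max_residual = max(abs(r) for r in residuals)

# Riemann-sum approximation of the total probability mass.
mass = sum(p_inf(-8.0 + 0.001 * k) * 0.001 for k in range(16001))
```

The residual is zero up to discretization error and the mass is one, so the candidate density satisfies the stationary equation together with the non-negativity and normalization requirements in this example.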
2.7 Mechanical Systems


Mechanical systems whose dynamics can be described by the Euler-Lagrange or
Hamilton  equations  of  motion  form  a  subclass  of  nonlinear  control  systems  that 
is  of  major  importance  in  this  thesis.  In  particular,  we  will  consider  mechanical 
systems  consisting  of  open  chained  rigid  links.  In  this  section  we  briefly  present  the 
mathematical  framework  for  working  with  this  subclass  of  systems.  The  material 
is  standard  and  drawn  mainly  from  texts  by  Murray,  Li,  and  Sastry  [112]  and 
Nijmeijer  and  van  der  Schaft  [121]  and  a  paper  by  Fuller  [51].  We  refer  to  the 
literature  for  all  proofs. 

The motion of a mechanical system can be described by a set of variables that
completely determines the configuration of the system. We refer to such a set of
variables as generalized coordinates, denoted in vector form as $q = (q_1, \dots, q_n) \in \mathbb{R}^n$,
where $n$ denotes the number of degrees of freedom (DOF) of the system.
For a mechanical system consisting of rigid links, the generalized coordinates are
almost always chosen to be the angles of the joints. We also refer to the $q_i$ as the
generalized positions and the $\dot{q}_i$ as the generalized velocities.

We express the external forces applied to the system in terms of components
along the generalized coordinates. These forces are referred to as generalized forces,
denoted in vector form as $F = (F_1, \dots, F_n) \in \mathbb{R}^n$. For the rigid link system with
joint angles acting as generalized coordinates, the generalized forces are the torques
applied about the joint axes.

The kinetic energy $K$ of the system is a function of the generalized positions
and velocities, i.e., $K = K(q, \dot{q})$. For a system of rigid links it is usually written as
the sum of a translational component and a rotational component. The potential
energy $U$ is a function of position only, i.e., $U = U(q)$. It is usually written as
the sum of stored energies due to gravity and mechanical stiffness. The dissipation
energy $R$ (also called the Rayleigh dissipation function) is generally a function of
position and velocity, i.e., $R = R(q, \dot{q})$. It contains terms reflecting generalized
mechanical damping.

We define the Lagrangian as the difference between the kinetic and potential
energies of the system, i.e.,

$$L(q, \dot{q}) = K(q, \dot{q}) - U(q) \tag{2.112}$$

The equations of motion for the system can be derived from the Lagrangian $L$ and
the Rayleigh dissipation function $R$ via the Euler-Lagrange equations of motion.

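Theorem 2.7.1 (Euler-Lagrange Equations of Motion) The equations of
motion for a mechanical system with generalized coordinates $q \in \mathbb{R}^n$, generalized
forces $F \in \mathbb{R}^n$, and Lagrangian $L$ are given by

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} + \frac{\partial R}{\partial \dot{q}_i} = F_i, \qquad i \in \underline{n} \tag{2.113}$$

□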
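As a worked illustration, consider a single rigid link (a simple pendulum) with one generalized coordinate $q$, the joint angle measured from the downward vertical. The symbols $m$ (mass), $\ell$ (length), $g_0$ (gravitational acceleration), and $c$ (viscous damping coefficient) are assumptions introduced only for this example, and the Euler-Lagrange equations are taken in the form $\frac{d}{dt} \frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} + \frac{\partial R}{\partial \dot{q}} = F$.

```latex
\begin{align*}
K(q,\dot{q}) &= \tfrac{1}{2}\, m \ell^2 \dot{q}^2, \qquad
U(q) = -m g_0 \ell \cos q, \qquad
R(q,\dot{q}) = \tfrac{1}{2}\, c\, \dot{q}^2, \\
L(q,\dot{q}) &= K - U = \tfrac{1}{2}\, m \ell^2 \dot{q}^2 + m g_0 \ell \cos q, \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} &= m \ell^2 \ddot{q}, \qquad
\frac{\partial L}{\partial q} = -m g_0 \ell \sin q, \qquad
\frac{\partial R}{\partial \dot{q}} = c\, \dot{q}, \\
m \ell^2 \ddot{q} &+ m g_0 \ell \sin q + c\, \dot{q} = F .
\end{align*}
```

The last line is the familiar damped, forced pendulum equation, with the applied torque $F$ playing the role of the generalized force.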

A control system model in standard state-space form is obtained from the
Euler-Lagrange equations of motion by interpreting the external forces as the control
inputs and expressing the kinetic energy as

$$K(q, \dot{q}) = \frac{1}{2}\, \dot{q}^T M(q)\, \dot{q} \tag{2.114}$$

where $M(q)$ is a positive-definite matrix called the inertia matrix or mass matrix.
The equations of motion can be written

$$M(q)\, \ddot{q} + C(q, \dot{q}) + N(q, \dot{q}) = F \tag{2.115}$$

where $C(q, \dot{q})$ represents Coriolis and centrifugal force terms and $N(q, \dot{q})$ includes
gravity and other forces which act at the joints (e.g., torsional damping, stiffness).
The state-space model is given by

$$\frac{d}{dt} \begin{bmatrix} q \\ \dot{q} \end{bmatrix} = \begin{bmatrix} \dot{q} \\ -M^{-1}(q) \left( C(q, \dot{q}) + N(q, \dot{q}) \right) \end{bmatrix} + \begin{bmatrix} 0 \\ M^{-1}(q) \end{bmatrix} F \tag{2.116}$$
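The state-space form can be sketched in code for a one-DOF example. The following assumes a simple pendulum with hypothetical parameters `m`, `ell`, `c`, and `g0` (mass, length, damping coefficient, gravitational acceleration), for which the inertia matrix is a scalar and there are no Coriolis or centrifugal terms. The fourth-order Runge-Kutta integrator is a standard choice added for the example and is not taken from the thesis.

```python
import math

def pendulum_rhs(state, torque, m=1.0, ell=1.0, c=0.1, g0=9.81):
    """Right side of d/dt (q, qdot) for M(q) qddot + C + N = F with one DOF."""
    q, qdot = state
    M = m * ell ** 2                                  # inertia "matrix" (scalar here)
    C = 0.0                                           # no Coriolis/centrifugal terms
    N = m * g0 * ell * math.sin(q) + c * qdot         # gravity plus joint damping
    return (qdot, (-(C + N) + torque) / M)

def rk4_step(state, torque, dt, **params):
    """One classical Runge-Kutta step with the torque held constant."""
    def shifted(s, k, h):
        return (s[0] + h * k[0], s[1] + h * k[1])

    k1 = pendulum_rhs(state, torque, **params)
    k2 = pendulum_rhs(shifted(state, k1, dt / 2), torque, **params)
    k3 = pendulum_rhs(shifted(state, k2, dt / 2), torque, **params)
    k4 = pendulum_rhs(shifted(state, k3, dt), torque, **params)
    return (state[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            state[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def energy(state, m=1.0, ell=1.0, g0=9.81):
    """Total energy K + U for the default parameters."""
    q, qdot = state
    return 0.5 * m * ell ** 2 * qdot ** 2 - m * g0 * ell * math.cos(q)

start = (1.0, 0.0)
e0 = energy(start)

undamped, damped = start, start
for _ in range(1000):
    undamped = rk4_step(undamped, 0.0, 0.001, c=0.0)   # no dissipation, no forcing
    damped = rk4_step(damped, 0.0, 0.001)              # default damping c = 0.1

e_undamped = energy(undamped)
e_damped = energy(damped)
```

Without damping or forcing the total energy stays essentially constant along the trajectory, whereas with damping it decreases, which is the qualitative behavior one expects from the dissipation term.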

We define the generalized momenta $p = (p_1, \dots, p_n) \in \mathbb{R}^n$ in terms of the
generalized coordinates $q$ and Lagrangian $L$ via the Legendre transformation

$$p_i = \frac{\partial L}{\partial \dot{q}_i}, \qquad i \in \underline{n} \tag{2.117}$$

We define the Hamiltonian, in terms of the generalized positions and momenta, as
the sum of the kinetic and potential energies of the system, i.e.,

$$H(q,p) = K(q,p) + U(q) \tag{2.118}$$

The Hamiltonian $H$ and Lagrangian $L$ are related by

$$H(q,p) = \langle p, \dot{q} \rangle - L(q, \dot{q}) \tag{2.119}$$


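The equations of motion can be restated in coordinates $(q,p)$, in terms of the
Hamiltonian $H$, mass matrix $M$, and Rayleigh dissipation function $R$ (reformulated
in terms of $q$ and $p$), in the obvious vector notation as

$$\frac{d}{dt} \begin{bmatrix} q \\ p \end{bmatrix} = \begin{bmatrix} \dfrac{\partial H}{\partial p}(q,p) \\[2mm] -\dfrac{\partial H}{\partial q}(q,p) - M(q)\, \dfrac{\partial R}{\partial p}(q,p) \end{bmatrix} + \begin{bmatrix} 0 \\ I \end{bmatrix} F \tag{2.120}$$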

Remark  2.7.2  An  advantage  of  this  formulation  is  that  the  equations  of  motion 
immediately  constitute  a  control  system  in  standard  state-space  form.  □ 


Remark 2.7.3 Using either formulation, the equations of motion yield a
state-space model of dimension $2n$, i.e., two state variables per DOF. We refer to such
a model as a second-order system. □

Remark 2.7.4 A system (2.120) for which there is no dissipation or forcing is
referred to as Hamiltonian or conservative. This terminology reflects the fact that
the total energy of the system remains constant, i.e., $\dot{H} = 0$. We refer to the
system (2.120) as a Hamiltonian system perturbed by dissipation and forcing. □

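The Poisson bracket is generally a bilinear map from $C^\infty(M) \times C^\infty(M)$ into
$C^\infty(M)$, where $M$ is a manifold, satisfying the properties of skew-symmetry, Jacobi
identity, and the Leibniz rule (see [121]). We will use a special case of the Poisson
bracket where $M = \mathbb{R}^n \times \mathbb{R}^n$, i.e., $M$ represents the space of generalized positions
and momenta, defined as follows.

Definition 2.7.5 (Poisson Bracket) Let $F : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ and $G : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$
be smooth functions. The Poisson bracket of $F$ and $G$ is the bilinear map defined
by

$$\{F, G\}(q,p) = \sum_{i=1}^{n} \left( \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial q_i} - \frac{\partial F}{\partial q_i} \frac{\partial G}{\partial p_i} \right)(q,p), \qquad (q,p) \in \mathbb{R}^n \times \mathbb{R}^n \tag{2.121}$$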

We  will  use  the  following  lemma  in  Chapter  4. 


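Lemma 2.7.6 Let $F : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ be a functional of the Hamiltonian $H$, i.e.,
$F = F(H(q,p)) \in \mathbb{R}$. Then $\{F, H\} = 0$.

Proof

$$\{F, H\} = \sum_{i=1}^{n} \left( \frac{\partial F}{\partial p_i} \frac{\partial H}{\partial q_i} - \frac{\partial F}{\partial q_i} \frac{\partial H}{\partial p_i} \right) \tag{2.122}$$

$$= \sum_{i=1}^{n} \left( \frac{dF}{dH} \frac{\partial H}{\partial p_i} \frac{\partial H}{\partial q_i} - \frac{dF}{dH} \frac{\partial H}{\partial q_i} \frac{\partial H}{\partial p_i} \right) \tag{2.123}$$

$$= \sum_{i=1}^{n} \frac{dF}{dH} \left( \frac{\partial H}{\partial p_i} \frac{\partial H}{\partial q_i} - \frac{\partial H}{\partial q_i} \frac{\partial H}{\partial p_i} \right) \tag{2.124}$$

$$= 0 \tag{2.125}$$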
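The lemma can be spot-checked numerically for one degree of freedom. The sketch below is illustrative only: it assumes a simple pendulum Hamiltonian with hypothetical parameters `m`, `ell`, and `g0`, approximates the partial derivatives by central differences, and uses the sign convention $\{F, G\} = \frac{\partial F}{\partial p} \frac{\partial G}{\partial q} - \frac{\partial F}{\partial q} \frac{\partial G}{\partial p}$. The evaluation point and the particular function of $H$ are arbitrary choices.

```python
import math

def hamiltonian(q, p, m=1.0, ell=1.0, g0=9.81):
    """H = K + U for a simple pendulum, with momentum p = m ell^2 qdot."""
    return p * p / (2.0 * m * ell ** 2) - m * g0 * ell * math.cos(q)

def poisson_bracket(F, G, q, p, h=1e-5):
    """{F, G} = dF/dp dG/dq - dF/dq dG/dp for one DOF, by central differences."""
    dF_dq = (F(q + h, p) - F(q - h, p)) / (2 * h)
    dF_dp = (F(q, p + h) - F(q, p - h)) / (2 * h)
    dG_dq = (G(q + h, p) - G(q - h, p)) / (2 * h)
    dG_dp = (G(q, p + h) - G(q, p - h)) / (2 * h)
    return dF_dp * dG_dq - dF_dq * dG_dp

def function_of_h(q, p):
    return math.exp(-0.1 * hamiltonian(q, p))    # depends on (q, p) only through H

def coordinate(q, p):
    return q                                     # not a function of H

bracket_fh = poisson_bracket(function_of_h, hamiltonian, 0.7, 1.3)
bracket_qh = poisson_bracket(coordinate, hamiltonian, 0.7, 1.3)
```

The first bracket vanishes to within discretization error, as the lemma predicts, while the second does not: with this sign convention it equals $-\partial H / \partial p$, which is $-1.3$ at the chosen point.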
Chapter 3


Standard  and  Ad-Hoc 
Approaches  to  Model  Reduction 

3.1  Introduction 

This chapter introduces two prominent and related approaches for deriving
low-order approximations to high-order nonlinear system models, referred to generically
as  the  POD  and  balanced  truncation.  Basic  versions  of  these  methods  have  become 
standard  model  reduction  tools  and  have  been  used  in  a  variety  of  application  areas 
during  the  past  two  decades. 

Although the POD and balanced truncation are founded in rigorous mathematical
results, application of these standard tools to model reduction for nonlinear
control systems requires ad-hoc assumptions and procedures. Perhaps the most
obvious ad-hoc procedure is linearization, whereby it is assumed that the original
nonlinear model can be approximated (locally) by a linear system derived from a
Taylor series expansion. The literature regarding applied balanced truncation to
this date is concerned only with model reduction for linear systems. The POD, on
the other hand, can be applied to nonlinear control system models, but assumptions
about locality, and situational procedures, are still needed. The POD has
been applied recently in situations where the input is pre-determined or the system
is expected to evolve close to a pre-specified trajectory.

The  purpose  of  this  chapter  is  to 

•  provide  an  overview  of  the  state-of-the-art,  including  important  aspects  of 
the  underlying  theory,  computational  issues,  advantages  and  shortcomings, 
and  selected  applications; 

•  motivate  the  research  presented  in  Chapter  4;  and 

•  explain  the  methods  and  computational  tools  used  in  Chapter  6. 

The POD and balancing methods for determining a suitable coordinate
transformation are presented, respectively, in Sections 3.2 and 3.3. The general procedure
for component truncation is outlined in Section 3.4. We summarize and make some
additional remarks in Section 3.5. The subject matter relies heavily on concepts
introduced in Sections 2.1-2.5.


3.2  Proper  Orthogonal  Decomposition 

The proper orthogonal decomposition (POD) of a second-order stochastic process
is one member of the class of representations known as orthogonal expansions (the
Fourier series, or harmonic decomposition, is another example). Its usefulness in
the area of model reduction stems from its mathematical properties pertaining to
its efficiency in terms of representing an ensemble of signals. The POD is also
known as the Karhunen-Loève expansion (named after two [76, 95] of the several
scientists who are credited with its independent discovery; see [15]), and in certain
contexts as principal component analysis (PCA).

An orthogonal expansion of a second-order stochastic process $\{X_t,\ t \in T\}$ is an
expression of the form

$$X_t = \sum_{i=1}^{\infty} \sigma_i(t)\, Z_i, \qquad t \in T \tag{3.1}$$

where the set $\{Z_1, Z_2, \dots\}$ is an orthonormal basis for $\mathcal{H}_X$ (see Definition 2.5.29)
and the coefficient functions $\{\sigma_i(t) = \langle X_t, Z_i \rangle,\ i = 1, 2, \dots\}$ are completely
deterministic and square-integrable on $T$. Representations of this form permit the
family of random variables $\{X_t,\ t \in T\}$ to be expressed as a linear combination of
a countable number of orthonormal random variables $\{Z_1, Z_2, \dots\}$.

The separable Hilbert space $\mathcal{H}_X$ contains an infinite number of possible
orthonormal basis sets $\{Z_1, Z_2, \dots\}$ (see Proposition 2.4.14). We shall see that the
basis derived via the POD is an advantageous choice. In particular, the coefficient
functions $\{\sigma_1, \sigma_2, \dots\}$ form an orthonormal set in $\mathcal{L}_2(T)$, the span of which is
capable of representing all members of the ensemble, and the individual terms in the
series (3.1) can be ranked according to their respective relative contributions to the
energy, on average, contained in members of the ensemble. This ranking allows
for an efficient representation via truncation of (3.1) at a suitably low index. The
POD and its properties are derived mainly using the spectral theory of compact,
self-adjoint, integral operators, as described in the following sections.

The  exposition  contained  in  this  section  is  based  on  that  from  several  sources, 
but  we  believe  that  it  is  original  in  its  treatment,  in  particular  toward  illuminating 
the  applied  aspects  in  a  rigorous  way.  Rigorous  mathematical  treatments  are 
given,  from  a  purely  theoretical  standpoint  in  [8,  96,  166],  and  with  a  view  toward 
applications to model reduction in [15, 67, 118, 119, 147, 149]. For simplicity we
introduce the main results in the context of one-parameter processes. Later, results
are interpreted in terms of two-parameter processes for purposes of application.


3.2.1  Derivation 

Consider an ensemble of signals

$$\{ X(\omega, \cdot) : [a,b] \to \mathbb{R}^n,\ \omega \in \Omega \} \tag{3.2}$$

Each member of the ensemble (i.e., for each fixed $\omega$) is a function in $\mathcal{L}_2[a,b]$. We
assign to the ensemble a probabilistic structure including an associated averaging
operation $E[\cdot]$. The nature of the randomness is not important for the sake of
this discussion. It could be due, e.g., to strong dependence on unpredictable initial
conditions. We assume that the stochastic process $\{X_t,\ t \in [a,b]\}$ is second-order,
and without loss of generality, has zero mean, i.e., $E[X_t] = 0$.

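Now, consider the problem of determining which single deterministic function
$\phi \in \mathcal{L}_2[a,b]$ is most similar, on average, to the members of the ensemble, i.e., find

$$\max_{\phi \in \mathcal{L}_2[a,b]} \frac{E\left[\, |\langle \phi, X_t \rangle_{\mathcal{L}_2[a,b]}|^2 \,\right]}{\langle \phi, \phi \rangle_{\mathcal{L}_2[a,b]}} \tag{3.3}$$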

Remark 3.2.1 The function $\phi$ is most nearly parallel to signals in the ensemble,
on average, in the function space $\mathcal{L}_2[a,b]$. □


The maximization problem (3.3) is a classical problem in the calculus of
variations. A necessary condition for (3.3) to hold is that $\phi$ be an eigenfunction
of the integral operator with kernel given by the two-point covariance function
$R(t,s) = E[X_t X_s^T]$,

$$\int_a^b R(t,s)\, \phi(s)\, ds = \lambda\, \phi(t), \qquad t \in [a,b] \tag{3.4}$$

Remark 3.2.2 The spectral theory of compact, self-adjoint, integral operators (see,
e.g., [150]) ensures that the maximum in (3.3) is achieved and corresponds to the
largest eigenvalue $\lambda_{\max}$ of the integral operator in (3.4). Furthermore, under the
condition that $[a,b]$ is bounded, Hilbert-Schmidt theory (see, e.g., [56]) guarantees
the existence of a countably infinite number of solutions $\{\phi_1, \phi_2, \dots\}$ of (3.4). □

The  key  result  is  Mercer’s  theorem  (see,  e.g.,  [56]),  which  gives  the  spectral 
decomposition  of  an  integral  operator  with  continuous,  self-adjoint,  non-negative 
definite  kernel. 

Theorem 3.2.3 (Mercer) Let $k(\cdot,\cdot)$ be a continuous, Hermitian symmetric,
non-negative definite function on $[a,b] \times [a,b]$. If $\{\phi_1, \phi_2, \dots\}$ are the orthonormal
eigenvectors corresponding to the non-zero eigenvalues $\{\lambda_1, \lambda_2, \dots\}$ of the integral
operator with kernel $k(\cdot,\cdot)$ then for all $t, s \in [a,b]$

$$k(t,s) = \sum_{i=1}^{\infty} \lambda_i\, \phi_i(t)\, \phi_i(s) \tag{3.5}$$

The series converges absolutely and uniformly on $[a,b] \times [a,b]$. □

Remark 3.2.4 The spectral decomposition of the covariance is given by

$$R(t,s) = \sum_{i=1}^{\infty} \lambda_i\, \phi_i(t)\, \phi_i^T(s) \tag{3.6}$$

where $\{\phi_1, \phi_2, \dots\}$ are solutions of the integral equation (3.4). It follows from
non-negative definiteness of $R$ that the eigenvalues $\{\lambda_1, \lambda_2, \dots\}$ are non-negative. By
convention they are ordered such that $\lambda_i \ge \lambda_{i+1}$. □

The POD is a direct consequence of Mercer’s theorem.

Theorem 3.2.5 (Proper Orthogonal Decomposition) Let $\{X_t,\ t \in [a,b]\}$ be
a zero-mean q.m. continuous second-order stochastic process with covariance
function $R(t,s)$. The process $X_t$ has an orthogonal decomposition

$$X(\omega,t) = \lim_{N \to \infty} \text{in q.m.} \sum_{i=1}^{N} \sqrt{\lambda_i}\; a_i(\omega)\, \phi_i(t), \qquad t \in [a,b] \tag{3.7}$$

with

$$E[a_i a_j] = \delta_{ij} \tag{3.8}$$

and

$$\langle \phi_i, \phi_j \rangle_{\mathcal{L}_2[a,b]} = \int_a^b \phi_i^T(t)\, \phi_j(t)\, dt = \delta_{ij} \tag{3.9}$$

if and only if the $\{\phi_1, \phi_2, \dots\}$ are the orthonormal eigenfunctions and the
$\{\lambda_1, \lambda_2, \dots\}$ are the corresponding eigenvalues of the integral operator with kernel
$R(t,s)$, i.e.,

$$\int_a^b R(t,s)\, \phi_i(s)\, ds = \lambda_i\, \phi_i(t), \qquad t \in [a,b], \quad i = 1, 2, \dots \tag{3.10}$$

In that case, the series (3.7) converges uniformly on $[a,b]$.

Proof See Appendix D and [8, 96, 166].


Remark 3.2.6 The coefficient functions $\phi_i$ that correspond to non-zero
eigenvalues $\lambda_i$ (and hence that contribute to the convergent series in (3.5) and (3.7)) are
called the empirical eigenfunctions of the ensemble. They form an orthonormal
basis for the subspace of $\mathcal{L}_2[a,b]$ to which all members of the ensemble belong (except
for a set of measure zero; see Fact 3.2.24). □

Remark 3.2.7 The uncorrelated random variables $\{a_1, a_2, \dots\}$ are given by

$$a_i(\omega) = \left( \sqrt{\lambda_i} \right)^{-1} \int_a^b \phi_i^T(t)\, X(\omega,t)\, dt, \qquad i = 1, 2, \dots \tag{3.11}$$

and form the desired orthonormal basis for $\mathcal{H}_X$. □

Remark 3.2.8 Each eigenvalue $\lambda_i$ can be interpreted as the mean energy of the
signals in the ensemble projected onto the $\phi_i$-axis in function space $\mathcal{L}_2[a,b]$ (see,
e.g., [118] for the calculation), i.e.,

$$\lambda_i = E\left[\, |\langle \phi_i, X_t \rangle_{\mathcal{L}_2[a,b]}|^2 \,\right], \qquad i = 1, 2, \dots \tag{3.12}$$

This interpretation justifies the ranking of terms in the series (3.7) by relative
energy contribution. □
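The eigenvalue problem and the energy interpretation can be illustrated on sampled data. The sketch below is only a discrete analogue under stated assumptions: scalar signals on a uniform midpoint grid of $[0,1]$, a synthetic ensemble built from two sinusoidal modes with random amplitudes, a sample average in place of $E[\cdot]$, and power iteration to obtain the leading eigenpair. All function names, the grid size, and the ensemble are hypothetical choices made for the example.

```python
import math
import random

def inner(u, v, dt):
    """Discrete L2 inner product (midpoint rule on a uniform grid)."""
    return sum(a * b for a, b in zip(u, v)) * dt

def leading_pod_mode(signals, dt, iters=100):
    """Largest eigenvalue and eigenfunction of the sample covariance operator."""
    n = len(signals[0])
    # Sample estimate of R(t_k, t_l) = E[X(t_k) X(t_l)] for scalar signals.
    R = [[sum(x[k] * x[l] for x in signals) / len(signals) for l in range(n)]
         for k in range(n)]
    phi = [1.0 / math.sqrt(n * dt)] * n             # unit-norm starting guess
    lam = 0.0
    for _ in range(iters):                          # power iteration
        y = [inner(R[k], phi, dt) for k in range(n)]    # (R phi)(t_k)
        lam = math.sqrt(inner(y, y, dt))
        phi = [v / lam for v in y]
    return lam, phi

rng = random.Random(2)
n, dt = 40, 1.0 / 40
t = [(k + 0.5) * dt for k in range(n)]
mode1 = [math.sqrt(2.0) * math.sin(math.pi * s) for s in t]
mode2 = [math.sqrt(2.0) * math.sin(2.0 * math.pi * s) for s in t]

# Ensemble: dominant first mode, weaker second mode, independent Gaussian amplitudes.
signals = []
for _ in range(300):
    a1, a2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    signals.append([2.0 * a1 * u + 0.5 * a2 * v for u, v in zip(mode1, mode2)])

lam, phi = leading_pod_mode(signals, dt)

# Mean energy of the ensemble projected onto the leading empirical eigenfunction.
mean_energy = sum(inner(phi, x, dt) ** 2 for x in signals) / len(signals)
overlap = abs(inner(phi, mode1, dt))
```

In this example the leading eigenvalue coincides with the mean projected energy up to rounding, and the leading empirical eigenfunction is nearly aligned with the dominant mode used to build the ensemble.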

Two  Parameter  Processes 

To model the two-parameter case, consider the ensemble of signals

$$\{X(\omega,\cdot,\cdot) : [0,\infty) \times \mathcal{D} \to \mathbb{R}^n,\ \omega \in \Omega\} \tag{3.13}$$

where $\mathcal{D}$ is bounded. Each member of the ensemble is a function in, respectively,
$\mathcal{L}^2(0,\infty)$ and $\mathcal{L}^2(\mathcal{D})$, for fixed $x$ and fixed $t$. As before, we assign a probabilistic
structure and assume that the process $\{X_{t,x},\ t \in [0,\infty),\ x \in \mathcal{D}\}$ is second-order
and has zero mean, i.e., $E[X_{t,x}] = 0$. We also assume that the process is wide-sense
stationary and ergodic with respect to $t$. It then has a spatial covariance
function given by

$$R(x,y) = E\left[X_{t,x}\, X_{t,y}^T\right] = \lim_{T\to\infty} \frac{1}{T} \int_0^T X_{t,x}\, X_{t,y}^T\, dt \tag{3.14}$$

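The spectral decomposition and spatial POD, respectively, are given by

$$R(x,y) = \sum_{i=1}^{\infty} \lambda_i\, \phi_i(x)\, \phi_i(y) \tag{3.15}$$

where convergence is absolute and uniform on $\mathcal{D} \times \mathcal{D}$, and

$$X(\omega,t,x) = \lim_{N\to\infty} \text{in q.m.} \sum_{i=1}^{N} \sqrt{\lambda_i}\, a_i(\omega,t)\, \phi_i(x), \qquad t \in [0,\infty), \quad x \in \mathcal{D} \tag{3.16}$$

where convergence is uniform on $[0,\infty) \times \mathcal{D}$.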
Remark 3.2.9 The coefficient functions $\phi_i$ that correspond to non-zero eigenvalues
$\lambda_i$ are called the spatial empirical eigenfunctions of the ensemble. They are the
unit magnitude solutions of the family of integral equations

$$\int_{\mathcal{D}} R(x,y)\, \phi_i(y)\, dy = \lambda_i\, \phi_i(x), \qquad x \in \mathcal{D}, \quad i = 1, 2, \ldots \tag{3.17}$$

and form an orthonormal basis for the subspace of $\mathcal{L}^2(\mathcal{D})$ to which all members of
the ensemble belong for each fixed $t$ (except for a set of measure zero). □

Remark 3.2.10 The random functions $\{a_1, a_2, \ldots\}$ are given by

$$a_i(\omega,t) = \left(\sqrt{\lambda_i}\right)^{-1} \int_{\mathcal{D}} \phi_i(x)\, X(\omega,t,x)\, dx, \qquad i = 1, 2, \ldots \tag{3.18}$$

They are stochastic processes, inherit ergodicity from $X_{t,x}$, and are uncorrelated in
the sense that

$$E[a_i(t)\, a_j(t)] = \lim_{T\to\infty} \frac{1}{T} \int_0^T a_i(t)\, a_j(t)\, dt = \delta_{ij} \tag{3.19}$$

□

Remark 3.2.11 Sirovich [147] refers to the spatial empirical eigenfunctions
$\{\phi_1, \phi_2, \ldots\}$ as coherent structures. This terminology stems from the interpretation
of the signals as realizations of a physical flow in time and space (e.g., fluid momentum,
heat). The empirical eigenfunctions then correspond to physically manifested,
coherent spatial structures in the flow. □

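Remark 3.2.12 Each eigenvalue $\lambda_i$ can be interpreted as the mean energy of the
signals in the ensemble projected onto the $\phi_i$-axis in function space $\mathcal{L}^2(\mathcal{D})$, and
equivalently, by ergodicity with respect to time, as the average relative time spent
by signals in the ensemble along the $\phi_i$-axis, i.e.,

$$\lambda_i = E\left[\, \left| \langle \phi_i, X_{t,x} \rangle_{\mathcal{L}^2(\mathcal{D})} \right|^2 \,\right] = \lim_{T\to\infty} \frac{1}{T} \int_0^T \left| \langle \phi_i, X_{t,x} \rangle_{\mathcal{L}^2(\mathcal{D})} \right|^2 dt, \qquad i = 1, 2, \ldots \tag{3.20}$$

□
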
Remark 3.2.13 We note that since the time domain is unbounded (which typically
models the evolution of a dynamical system), there is no “temporal POD” for this
two-parameter case. This is due to the fact that the integral operator with kernel
$R(t,s)$ is compact if and only if its domain is bounded, and hence has no spectral
decomposition in the case of unbounded domain. □

Sampled  Data  Processes 

In most practical applications, we work with processes that are sampled in time
or space or both. Consider the ensemble of sampled signals

$$\{X(\omega,\cdot) : \{1, 2, \ldots\} \to \mathbb{R}^n,\ \omega \in \Omega\} \tag{3.21}$$

Each member of the ensemble is a vector in $\ell^2$. As usual, we assume that the
process $\{X_k,\ k = 1, 2, \ldots\}$ is second-order, and without loss of generality, has zero
mean, i.e., $E[X_k] = 0$. The discrete covariance function is given by

$$R(j,k) = E\left[X_j X_k^T\right] \tag{3.22}$$

which, in the case that $X_k$ is scalar valued, can be written as a matrix $R = [R]_{jk} =
[R(j,k)]$ (if $X_k$ is vector valued then $R$ can be written as a fourth-order tensor).
The matrix (tensor) $R$ is real, symmetric, and non-negative definite.

The spectral theorem (see, e.g., [154]) states that every real symmetric matrix
$R$ can be diagonalized by an orthogonal matrix, i.e., there exists an orthogonal
matrix $\Phi$ and a real diagonal matrix $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ such that $R = \Phi \Lambda \Phi^T$.
We write the spectral decomposition of $R$ in (3.22) as

$$R = \sum_{i=1}^{\infty} \lambda_i\, \phi_i \phi_i^T \tag{3.23}$$

where each vector $\phi_i$ is the $i$-th column of $\Phi$.
Remark 3.2.14 The operator given by multiplication by the covariance matrix is
always compact, even with an infinite number of rows and columns. We need not
worry about boundedness of the time domain here (i.e., even though $k = 1, 2, \ldots$,
the spectral decomposition exists). □

Remark 3.2.15 Because the covariance matrix $R$ is non-negative definite, the
eigenvalues $\{\lambda_1, \lambda_2, \ldots\}$ are non-negative. By convention they are ordered such
that $\lambda_i \ge \lambda_{i+1}$. □

The  sampled  data  POD  is  a  direct  consequence  of  the  spectral  theorem. 

Theorem 3.2.16 (Sampled Data POD) Let $\{X_k,\ k = 0, 1, \ldots\}$ be a zero-mean
scalar-valued discrete-parameter second-order stochastic process with covariance
matrix $R$. The process $X_k$ has an orthogonal decomposition

$$X(\omega,k) = \lim_{N\to\infty} \text{in q.m.} \sum_{i=1}^{N} \sqrt{\lambda_i}\, a_i(\omega)\, (\phi_i)_k, \qquad k = 0, 1, \ldots \tag{3.24}$$

with

$$E[a_i a_j] = \delta_{ij} \tag{3.25}$$

and

$$\langle \phi_i, \phi_j \rangle = \phi_i^T \phi_j = \delta_{ij} \tag{3.26}$$

if and only if the $\{\phi_1, \phi_2, \ldots\}$ are the orthonormal eigenvectors and the $\{\lambda_1, \lambda_2, \ldots\}$
are the corresponding eigenvalues of the matrix $R$, i.e.,

$$R\, \phi_i = \lambda_i\, \phi_i, \qquad i = 1, 2, \ldots \tag{3.27}$$

Proof See Appendix D. ■

Remark 3.2.17 The vectors $\phi_i$ corresponding to non-zero eigenvalues $\lambda_i$ are called
the empirical eigenvectors of the ensemble. They form an orthonormal basis for
the subspace of $\ell^2$ to which all members of the ensemble belong (except for a set of
measure zero). □

Remark 3.2.18 The uncorrelated random variables $\{a_1, a_2, \ldots\}$ are given by

$$a_i(\omega) = \left(\sqrt{\lambda_i}\right)^{-1} \sum_{k=0}^{\infty} (\phi_i)_k\, X(\omega,k) \tag{3.28}$$

and form the desired orthonormal basis for $\mathcal{H}_X$. □

Remark 3.2.19 It is often convenient to express the scalar-valued sampled data
process $\{X_k,\ k = 0, 1, \ldots\}$ as a random vector $X = [X_1, X_2, \ldots]^T$. In this case the
covariance matrix is given by $R = E[X X^T]$ and the sampled data POD (3.24)
is written compactly as

$$X = \Phi \Lambda^{1/2} a = \Phi b \tag{3.29}$$

where $\Phi = [\Phi]_{ki} = [(\phi_i)_k]$ is an orthogonal matrix whose columns are the empirical
eigenvectors, $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots)$, and

$$a = [a_1, a_2, \ldots]^T = \Lambda^{-1/2} \Phi^T X \tag{3.30}$$

and $b = \Lambda^{1/2} a$ are random vectors. The mean energy interpretation of the eigenvalues
is expressed by

$$\lambda_i = E\left[\, |\phi_i^T X|^2 \,\right], \qquad i = 1, 2, \ldots \tag{3.31}$$

□
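
The compact form (3.29)-(3.31) can be checked numerically on a finite ensemble. The following is a minimal sketch, assuming NumPy and a synthetic zero-mean ensemble in $\mathbb{R}^6$. The sizes, the scaling vector `scales`, the rotation `Q`, and all variable names are illustrative assumptions, and expectations are replaced by averages over the sampled realizations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ensemble (an assumption for illustration): 5000 realizations of a
# zero-mean random vector in R^6 with different variances along hidden axes.
n, n_samples = 6, 5000
scales = np.array([3.0, 2.0, 1.5, 1.0, 0.5, 0.25])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal rotation
X = Q @ (scales[:, None] * rng.standard_normal((n, n_samples)))
X -= X.mean(axis=1, keepdims=True)                 # enforce zero sample mean

# Sample covariance R = E[X X^T], with the expectation replaced by an average.
R = X @ X.T / n_samples

# Spectral decomposition R = Phi Lambda Phi^T. eigh returns ascending
# eigenvalues, so reverse to obtain lambda_1 >= lambda_2 >= ...
lam, Phi = np.linalg.eigh(R)
lam, Phi = lam[::-1], Phi[:, ::-1]

# Coefficients a = Lambda^{-1/2} Phi^T X as in (3.30), and b = Lambda^{1/2} a.
a = (Phi.T @ X) / np.sqrt(lam)[:, None]
b = np.sqrt(lam)[:, None] * a

corr_a = a @ a.T / n_samples                   # sample version of E[a_i a_j]
energy = np.mean((Phi.T @ X) ** 2, axis=1)     # sample version of E[|phi_i^T X|^2]
recon_err = np.max(np.abs(Phi @ b - X))        # X = Phi b, as in (3.29)
```

In this sketch `corr_a` is the identity up to rounding, `energy` matches `lam`, and `recon_err` is at rounding level, which mirrors (3.25), (3.31), and (3.29) for the sample covariance.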

Two  Parameter  Sampled  Data  Processes 

The typical situation that arises in dynamical systems applications is that of an
ensemble of signals that evolves on a time continuum but whose spatial domain
has been discretized. Moreover it is usually the case that the spatial domain is
bounded, resulting in a finite number of samples in the spatial parameter (e.g., time
evolution of the temperature field in a solid body discretized via finite-elements).
Consider an ensemble of such signals

$$\{X(\omega,\cdot,\cdot) : [0,\infty) \times \{1, \ldots, n\} \to \mathbb{R}^n,\ \omega \in \Omega\} \tag{3.32}$$

We assume as before that the process

$$\{X_{t,k},\ t \in [0,\infty),\ k = 1, \ldots, n\}$$

is second-order and has zero mean, i.e., $E[X_{t,k}] = 0$. We also assume that the
process is wide-sense stationary and ergodic with respect to $t$. The discrete spatial
covariance function is given by

$$R(j,k) = E[X_{t,j}\, X_{t,k}] = \lim_{T\to\infty} \frac{1}{T} \int_0^T X_{t,j}\, X_{t,k}\, dt \tag{3.33}$$

which as before can be expressed as a matrix $R = [R]_{jk} = [R(j,k)]$ with spectral
decomposition $R = \sum_{i=1}^{n} \lambda_i\, \phi_i \phi_i^T$.

The sampled-data spatial POD is given by

$$X(\omega,t,k) = \sum_{i=1}^{n} \sqrt{\lambda_i}\, a_i(\omega,t)\, (\phi_i)_k, \qquad k = 1, \ldots, n \tag{3.34}$$

Remark 3.2.20 The vectors $\phi_i$ corresponding to non-zero eigenvalues $\lambda_i$ are called
the spatial empirical eigenvectors of the ensemble. They are the unit length solutions
of (3.27) (i.e., unit eigenvectors of the matrix $R$) and form an orthonormal
basis for the subspace of $\mathbb{R}^n$ to which all members of the ensemble belong for each
fixed $t$ (except for a set of measure zero). □

Remark 3.2.21 The random functions $\{a_1, \ldots, a_n\}$ are given by

$$a_i(\omega,t) = \left(\sqrt{\lambda_i}\right)^{-1} \sum_{k=1}^{n} (\phi_i)_k\, X(\omega,t,k) \tag{3.35}$$

They are stochastic processes, inherit ergodicity from $X_{t,k}$, and are uncorrelated in
the sense of (3.19). □
Remark 3.2.22 In applications, the continuous-time scalar-valued sampled spatial
data process

$$\{X_{t,k},\ t \in [0,\infty),\ k = 1, \ldots, n\}$$

is often expressed as a one-parameter vector process $X_t = [X_{t,1}, \ldots, X_{t,n}]^T$. In this
case the spatial covariance matrix is given by

$$R = E\left[X_t X_t^T\right] = \lim_{T\to\infty} \frac{1}{T} \int_0^T X_t X_t^T\, dt \tag{3.36}$$

and the sampled data spatial POD (3.34) is written compactly as

$$X_t = \Phi \Lambda^{1/2} a(t) = \Phi\, b(t) \tag{3.37}$$

where $\Phi = [\Phi]_{ki} = [(\phi_i)_k]$ is an orthogonal $n \times n$ matrix whose columns are the
spatial empirical eigenvectors, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, and

$$a(t) = [a_1(t), \ldots, a_n(t)]^T = \Lambda^{-1/2} \Phi^T X_t \tag{3.38}$$

and $b(t) = \Lambda^{1/2} a(t)$ are vector processes. The mean energy and average
duration interpretations of the eigenvalues are expressed, respectively, by

$$\lambda_i = E\left[\, |\phi_i^T X_t|^2 \,\right] = \lim_{T\to\infty} \frac{1}{T} \int_0^T |\phi_i^T X_t|^2\, dt, \qquad i = 1, \ldots, n \tag{3.39}$$

□


Remark 3.2.23 The relationship between the POD of a second-order stochastic
process and the PCA of a matrix valued signal (see Section 2.3) becomes apparent
by observing that the infinite time horizon Gramian matrix $W^2$ given by (2.29)
for signal $X(t)$ corresponds to the two-point spatial covariance matrix $R$. The
sampled-data spatial POD (3.34) is then the same as the PCA (2.30) where the
square roots of the $\lambda_i$ have been subsumed into the coefficient functions $a_i(t)$. □
3.2.2 Properties


There  are  two  properties  of  the  POD  in  which  we  are  most  interested  here.  The 
first  says  that  the  POD  does  in  fact  produce  a  representation  that  is  capable  of 
describing  all  of  the  observed  phenomena  from  which  it  was  derived.  The  other 
is  the  main  property  of  interest  in  the  context  of  model  reduction.  It  says  that 
the  POD  is  optimal,  or  most  efficient,  in  terms  of  modeling  the  signal  set  with  the 
fewest  number  of  modes. 

Recall the ensemble of signals

$$\{X(\omega,\cdot) : [a,b] \to \mathbb{R}^n,\ \omega \in \Omega\} \tag{3.40}$$

and suppose that the associated POD orthonormal basis set is $\{\phi_1, \phi_2, \ldots\}$. We
define the span of the empirical basis as the collection of functions that can be
represented by a convergent sequence of a linear combination of empirical eigenfunctions,
i.e.,

$$S_\Phi = \left\{ \sum_{i=1}^{\infty} a_i\, \phi_i \ :\ \sum_{i=1}^{\infty} a_i^2 < \infty \right\} \tag{3.41}$$

Similarly,  the  span  of  all  members  of  the  ensemble  is  given  by 

ioo  OO  ^ 

(3.42) 

i=  1  i=  1  J 

It  is  shown  in  [15]  that 

Fact 3.2.24 The sets $S_\Phi$ and $S_X$ are equivalent with the exception of a set of
measure zero. □

Remark  3.2.25  Thus,  every  member  of  the  ensemble  that  generated  the  empirical 
eigenfunctions,  and  linear  combinations  thereof,  can  be  represented  by  a  convergent 
series  of  a  linear  combination  of  the  empirical  eigenfunctions.  □ 
The optimality of the POD in terms of modeling a signal from among an
ensemble of signals is expressed by the following result (see, e.g., [15, 147]).

Proposition 3.2.26 (Optimality of the POD) Consider the ensemble

$$\{X(\omega,\cdot) : [a,b] \to \mathbb{R}^n,\ \omega \in \Omega\}$$

and let $\{\phi_1, \phi_2, \ldots\}$ be the empirical eigenfunctions with corresponding eigenvalues
$\{\lambda_1, \lambda_2, \ldots\}$. Let

$$\{X(\omega,t),\ \omega \in \Omega,\ t \in [a,b]\}$$

be a member of the ensemble with POD

$$X(\omega,t) = \sum_{i=1}^{\infty} b_i(\omega)\, \phi_i(t) \tag{3.43}$$

where the eigenvalues have been subsumed into the random coefficients, i.e., for
each $i$, $b_i(\omega) = \sqrt{\lambda_i}\, a_i(\omega)$. Let $\{\psi_1, \psi_2, \ldots\}$ be an arbitrary orthonormal set such
that for some random variables $\{c_1, c_2, \ldots\}$

$$X(\omega,t) = \sum_{i=1}^{\infty} c_i(\omega)\, \psi_i(t) \tag{3.44}$$

Then for each truncation index $N$

$$\sum_{i=1}^{N} E\left[\, |\langle \phi_i, X_t \rangle|^2 \,\right] = \sum_{i=1}^{N} \lambda_i \ \ge\ \sum_{i=1}^{N} E\left[\, |\langle \psi_i, X_t \rangle|^2 \,\right] \tag{3.45}$$

and equivalently

$$\sum_{i=1}^{N} E\left[\, \| b_i \|^2 \,\right] = \sum_{i=1}^{N} \lambda_i \ \ge\ \sum_{i=1}^{N} E\left[\, \| c_i \|^2 \,\right] \tag{3.46}$$

□

Remark 3.2.27 Thus, for any given number of modes $N$, the projection of any
member of the ensemble onto the subspace spanned by the most energetic $N$ members
of the empirical basis will contain more energy, on average, than the projection
onto the subspace spanned by $N$ members of any other orthonormal basis. □
Remark 3.2.28 The optimality property may also be interpreted as a minimization
of the error, on average, between members of the ensemble and the truncated
orthogonal expansion. To see this, observe that the minimum of the mean-squared
error

$$E\left[\, \Big\| X_t - \sum_{i=1}^{N} c_i(\omega)\, \psi_i \Big\|^2 \,\right] = E\left[\, \| X_t \|^2 \,\right] + \sum_{i=1}^{N} E\left[\, |c_i|^2 \,\right] - 2 \sum_{i=1}^{N} E\left[\, \langle c_i \psi_i, X_t \rangle \,\right] \tag{3.47}$$

is achieved when $\sum_{i=1}^{N} E[\langle c_i \psi_i, X_t \rangle]$ is maximized. Equation (3.45) gives the
empirical basis as the maximizing orthonormal set. □
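
Remarks 3.2.27 and 3.2.28 can be illustrated on a finite sample. The sketch below is only an illustration under stated assumptions: NumPy, a synthetic zero-mean ensemble in $\mathbb{R}^8$, and a competing orthonormal basis `Psi` drawn at random via QR. The helper names `captured_energy` and `truncation_mse` are hypothetical, and expectations are replaced by sample averages.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical zero-mean ensemble in R^8 with decaying variances along hidden axes.
n, n_samples, N = 8, 4000, 3
scales = 2.0 ** -np.arange(n)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
X = Q @ (scales[:, None] * rng.standard_normal((n, n_samples)))
X -= X.mean(axis=1, keepdims=True)

# Empirical eigenvectors of the sample covariance, in decreasing eigenvalue order.
R = X @ X.T / n_samples
lam, Phi = np.linalg.eigh(R)
lam, Phi = lam[::-1], Phi[:, ::-1]

def captured_energy(basis, N):
    """Sum over the first N basis vectors of the average squared coefficient."""
    coeffs = basis[:, :N].T @ X
    return np.sum(coeffs ** 2) / n_samples

def truncation_mse(basis, N):
    """Average squared error of the N-term expansion in the given basis."""
    P = basis[:, :N]
    return np.mean(np.sum((X - P @ (P.T @ X)) ** 2, axis=0))

# Competing orthonormal basis (random, for comparison only).
Psi, _ = np.linalg.qr(rng.standard_normal((n, n)))

pod_energy, other_energy = captured_energy(Phi, N), captured_energy(Psi, N)
pod_mse, other_mse = truncation_mse(Phi, N), truncation_mse(Psi, N)
```

For this sample, `pod_energy` equals the sum of the first `N` eigenvalues as in (3.45), it is no smaller than `other_energy`, and `pod_mse` is no larger than `other_mse`.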


Remark  3.2.29  There  exists  no  explicit  error  bound,  e.g.,  corresponding  to  (1.1), 
in  terms  of  the  eigenvalues  or  otherwise.  □ 


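Remark 3.2.30 In the two-parameter case, the optimality properties
(3.45) and (3.46) are expressed, respectively, as

$$\sum_{i=1}^{N} E\left[\, |\langle \phi_i, X_{t,x} \rangle|^2 \,\right] = \sum_{i=1}^{N} \lambda_i \ \ge\ \sum_{i=1}^{N} E\left[\, |\langle \psi_i, X_{t,x} \rangle|^2 \,\right] \tag{3.48}$$

and equivalently

$$\sum_{i=1}^{N} E\left[\, \| b_i \|^2 \,\right] = \sum_{i=1}^{N} \lambda_i \ \ge\ \sum_{i=1}^{N} E\left[\, \| c_i \|^2 \,\right] \tag{3.49}$$

for each $N$, where

$$X(\omega,t,x) = \sum_{i=1}^{\infty} b_i(\omega,t)\, \phi_i(x) = \sum_{i=1}^{\infty} c_i(\omega,t)\, \psi_i(x) \tag{3.50}$$

□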
3.2.3 Computation


The  computational  aspects  of  the  POD  are  crucial  to  analyzing  its  advantages 
and  shortcomings  as  a  model  reduction  methodology.  We  consider  the  issues  of 
centering,  practical  computation  of  the  empirical  basis,  and  practical  derivation  of 
low-order  models  for  nonlinear  control  systems. 

Centering  and  Zero-Mean  Processes 

Consider the ensemble

$$\{Z(\omega,\cdot) : T \to \mathbb{R}^n,\ \omega \in \Omega\} \tag{3.51}$$

It is often the case that members of the ensemble are very similar to each other
in the sense of (3.3) because their respective differences are small in magnitude
compared with the magnitude of the signals themselves. This centering problem
is addressed simply by subtracting out the average signal $\mu(t) = E[Z_t]$ from each
ensemble member, i.e.,

$$X_t = Z_t - \mu(t) \tag{3.52}$$

to give the zero-mean process $X_t$.

Remark  3.2.31  The  process  Xt  represents  the  deviation  or  fluctuation  from  the 
mean  signal.  The  recentering  minimizes  the  effect  of  noise  and  numerical  error  in 
computations  and  generates  a  POD  basis  set  and  corresponding  ranking  of  modes 
that  more  accurately  reflects  the  differences  in  energy  content  between  signals.  This 
justifies  our  emphasis  on  zero-mean  processes  in  Section  3.2.1.  □ 
Remark 3.2.32 The centering is similarly applied to the two-parameter and sampled
data cases, i.e.,

$$\begin{aligned}
X_{t,x} &= Z_{t,x} - \mu(t,x), & \mu(t,x) &= E[Z_{t,x}] \\
X_k &= Z_k - \mu(k), & \mu(k) &= E[Z_k] \\
X_{t,k} &= Z_{t,k} - \mu(t,k), & \mu(t,k) &= E[Z_{t,k}]
\end{aligned} \tag{3.53}$$

□
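
The effect of centering can be seen in a small numerical experiment. This is a minimal sketch, assuming NumPy and a synthetic ensemble whose members share a large common mean `mu` and differ only by small fluctuations along a direction `d`. Both vectors, the sizes, and the helper `leading_mode` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

n, n_samples = 5, 2000
mu = np.full(n, 10.0)                                    # large common mean (assumed)
d = np.array([1.0, -1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)  # unit fluctuation direction
Z = mu[:, None] + 0.1 * d[:, None] * rng.standard_normal(n_samples)

def leading_mode(Y):
    """Largest eigenvalue and eigenvector of the sample covariance of Y."""
    R = Y @ Y.T / Y.shape[1]
    lam, Phi = np.linalg.eigh(R)      # ascending order, so take the last pair
    return lam[-1], Phi[:, -1]

# Without centering, the leading mode is dominated by the mean signal.
lam_raw, phi_raw = leading_mode(Z)

# With centering as in (3.52), the leading mode captures the fluctuation.
X = Z - Z.mean(axis=1, keepdims=True)
lam_ctr, phi_ctr = leading_mode(X)

align_raw_mean = abs(phi_raw @ mu) / np.linalg.norm(mu)   # cosine with the mean
align_ctr_fluct = abs(phi_ctr @ d)                        # cosine with d
```

In this construction the uncentered leading eigenvector is nearly parallel to `mu` and its eigenvalue is several orders of magnitude larger than the fluctuation energy, while the centered leading eigenvector recovers `d` up to sign.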

Computing  the  Empirical  Basis 

Here we present standard methods for computing the empirical basis. Consider
the typical dynamical systems application where $X_t = [X_{t,1}, \ldots, X_{t,n}]^T$ is a zero-mean
vector process representing the fluctuation from the mean of a physical flow
in time (continuous) and space (discretized), or possibly some other multi-variable
state evolution in continuous time. We make the standard assumptions as usual.
We need to compute an approximation to the spatial covariance matrix

$$R = E\left[X_t X_t^T\right] = \lim_{T\to\infty} \frac{1}{T} \int_0^T X_t X_t^T\, dt \tag{3.54}$$

This is accomplished by sampling the flow $X_t$ at times $\{t_1, t_2, \ldots\}$, i.e., capturing
“snapshots”

$$\{X(t_1), X(t_2), \ldots\}$$

The times can be equally spaced by a fixed interval $\tau$, i.e., $t_\nu = (\nu - 1)\tau$ (mimicking
the action of a strobe). We define the approximation

$$R \approx \lim_{M\to\infty} \frac{1}{M} \sum_{\nu=1}^{M} X(t_\nu)\, X^T(t_\nu) \tag{3.55}$$

For all practical purposes, we must make another approximation and use only a
fixed finite number $M$ of samples, i.e.,

$$R_M = \frac{1}{M} \sum_{\nu=1}^{M} X(t_\nu)\, X^T(t_\nu) \tag{3.56}$$
Remark 3.2.33 The approximation improves as the number of samples $M$ increases.
The actual number of samples that are captured and used will depend on
practical considerations. However, it is reasonable to assume that $M < n$. □

Remark 3.2.34 In the applied literature, one often sees equation (3.56) written
as

$$R_M = \frac{1}{M}\, \Xi\, \Xi^T \tag{3.57}$$

where

$$\Xi = [\Xi]_{k\nu} = [X_k(t_\nu)] \tag{3.58}$$

is an $n \times M$ matrix called the data matrix. Actually, it is common for authors to
ignore the $1/M$ factor and the fact that an approximation is being made, and to
write $R = \Xi\, \Xi^T$. Also, this procedure is commonly mistaken for the “method of
snapshots,” which refers to something somewhat different, to be described shortly.
□

Remark 3.2.35 The approximate spatial covariance has at most $M$ non-zero eigenvalues
and hence at most $M$ approximate empirical eigenvectors. Thus, the span of the
empirical basis is at most an $M$-dimensional subspace of $\mathbb{R}^n$. □

Now we can use standard matrix algebra algorithms to compute the spectral
decomposition

$$R_M = \Phi \Lambda \Phi^T \tag{3.59}$$

The approximate empirical eigenvectors are given by the first $M$ columns
$\{\phi_1, \ldots, \phi_M\}$ of the $n \times n$ orthogonal matrix $\Phi$. They correspond to the $M$ non-zero
eigenvalues $\{\lambda_1, \ldots, \lambda_M\}$, i.e., the non-zero diagonal entries of $\Lambda$ (in decreasing
order).
Remark 3.2.36 In the applied literature, the spectral decomposition of $R_M$ is usually
accomplished by computing the singular value decomposition (SVD) of the data
matrix $\Xi$, i.e.,

$$\Xi = \Phi\, \Sigma\, \Psi^T \tag{3.60}$$

where $\Phi$ is the $n \times n$ orthogonal matrix containing the empirical eigenvectors, $\Sigma$ is
$n \times M$ given by

$$\Sigma = \begin{bmatrix} \mathrm{diag}(\sigma_1, \ldots, \sigma_M) \\ 0 \end{bmatrix} = \begin{bmatrix} \left(M\, \mathrm{diag}(\lambda_1, \ldots, \lambda_M)\right)^{1/2} \\ 0 \end{bmatrix} \tag{3.61}$$

and $\Psi$ is $M \times M$ and orthogonal. Thus, the empirical eigenvectors are computed
correctly via the SVD.

The ordering of the $\sigma_i$ remains the same as that of the $\lambda_i$ (and hence also the
ranking of modes). However, the precise meaning via relative energy resides with
the eigenvalues. Sometimes authors have ignored the square root operation and
mistakenly based their truncation analysis on the singular values. Moreover, it is
common for authors to claim that the POD and the SVD are equivalent procedures,
which ignores the underlying theory, assumptions, and approximations pertaining
to the POD. It is more accurate to think of the SVD as a tool that can be used in
practical computation of the POD. □
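
The relation between the two routes in (3.59)-(3.61) can be checked directly. The following is a minimal sketch, assuming NumPy and a random $n \times M$ data matrix `Xi` standing in for real snapshots. The sizes and variable names are illustrative assumptions. It also compares cumulative fractions computed from eigenvalues with those computed from singular values, which is the distinction the remark warns about.

```python
import numpy as np

rng = np.random.default_rng(3)

n, M = 50, 8                          # hypothetical sizes with M < n
Xi = rng.standard_normal((n, M))      # data matrix; column nu plays the role of X(t_nu)

# Route 1: spectral decomposition of R_M = (1/M) Xi Xi^T, as in (3.57) and (3.59).
R_M = Xi @ Xi.T / M
lam, Phi = np.linalg.eigh(R_M)
lam, Phi = lam[::-1][:M], Phi[:, ::-1][:, :M]   # keep the M non-zero eigenpairs

# Route 2: SVD of the data matrix, as in (3.60).
U, sigma, Vt = np.linalg.svd(Xi, full_matrices=False)

# Singular values and eigenvalues are related by sigma_i = sqrt(M lambda_i).
lam_from_svd = sigma ** 2 / M

# Eigenvectors are determined only up to sign, so compare |phi_i^T u_i|.
overlap = np.abs(np.sum(Phi * U, axis=0))

# Relative energy is a statement about eigenvalues, not singular values.
energy_frac = np.cumsum(lam_from_svd) / np.sum(lam_from_svd)
sigma_frac = np.cumsum(sigma) / np.sum(sigma)
```

Here `energy_frac` is never below `sigma_frac`, so a truncation index chosen from singular-value fractions retains more energy than the nominal threshold suggests and can lead to keeping more modes than needed.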


Computational difficulties occur when the dimension $n$ of $X_t$ is large, e.g., due
to a high resolution spatial discretization. In particular, this forces the spectral
decomposition of a large matrix, possibly requiring very high computational expense,
and possibly leading to the accumulation of large numerical errors. Sirovich [147]
introduced a method to deal with this situation, coining the names “method of
snapshots” and “method of strobes”. It is a method for computing the empirical
eigenvectors without forming and decomposing the very large two-point spatial
covariance matrix.

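Assume that the snapshots $X(t_\nu)$ are linearly independent vectors in $\mathbb{R}^n$, i.e.,
the data matrix $\Xi$ has full column rank, and define the $M \times M$ matrix

$$C_M = \frac{1}{M}\, \Xi^T\, \Xi \tag{3.62}$$

Let $\{\psi_1, \ldots, \psi_M\}$ and $\{\mu_1, \ldots, \mu_M\}$ be, respectively, the eigenvectors and eigenvalues
of $C_M$, i.e.,

$$C_M\, \psi_i = \mu_i\, \psi_i, \qquad i = 1, \ldots, M \tag{3.63}$$

A simple calculation reveals that if we define

$$\phi_i = \frac{1}{\sqrt{M \mu_i}}\, \Xi\, \psi_i, \qquad i = 1, \ldots, M \tag{3.64}$$

then

$$R_M\, \phi_i = \frac{1}{M}\, \Xi\, \Xi^T \phi_i = \mu_i\, \phi_i, \qquad i = 1, \ldots, M \tag{3.65}$$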

Remark 3.2.37 Thus, the eigenvalues $\mu_i$ of the $M \times M$ matrix $C_M$ correspond
exactly to the non-zero eigenvalues $\lambda_i$ of the $n \times n$ matrix $R_M$. The relationship
between the empirical eigenvectors $\phi_i$ and the eigenvectors $\psi_i$ of $C_M$ is given
by (3.64), i.e., the empirical eigenvectors are linear combinations of the snapshots
$X(t_\nu)$. □

Remark 3.2.38 The advantage of the method of snapshots is that we compute the
empirical eigenvectors via spectral decomposition of the $M \times M$ matrix $C_M$ instead
of the much larger $n \times n$ matrix $R_M$. □

Remark 3.2.39 In the literature, sometimes $C_M$ is referred to as the covariance
matrix. It is actually a temporal covariance. Time sampling produces a compact
operator which admits a spectral decomposition yielding temporal empirical eigenvectors,
which are related to the spatial empirical eigenvectors by (3.64). □
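
The method of snapshots described above can be sketched in a few lines. This is an illustrative sketch rather than Sirovich's original implementation: it assumes NumPy, a random data matrix `Xi` with $M \ll n$ in place of real snapshots, unit-norm eigenvectors of $C_M$, and a $1/\sqrt{M \mu_i}$ scaling chosen so that the lifted vectors have unit length. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

n, M = 2000, 10                      # hypothetical sizes with M << n
Xi = rng.standard_normal((n, M))     # data matrix; columns stand in for snapshots

# Small M x M temporal covariance C_M = (1/M) Xi^T Xi, as in (3.62).
C_M = Xi.T @ Xi / M
mu, Psi = np.linalg.eigh(C_M)
mu, Psi = mu[::-1], Psi[:, ::-1]     # decreasing eigenvalue order

# Lift each eigenvector of C_M to R^n: phi_i is a linear combination of the
# snapshots, scaled by 1/sqrt(M mu_i) so that it has unit length.
Phi = (Xi @ Psi) / np.sqrt(M * mu)

# Verify R_M phi_i = mu_i phi_i without ever forming the n x n matrix R_M:
# R_M Phi = Xi (Xi^T Phi) / M.
residual = Xi @ (Xi.T @ Phi) / M - Phi * mu
gram = Phi.T @ Phi                   # should be the M x M identity
```

Only an $M \times M$ eigenproblem is solved, and the $n \times n$ covariance is never stored. The check on `residual` applies $R_M$ through two thin matrix products for the same reason.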

Deriving  the  Reduced-Order  Model 

Here we describe the typical procedure for deriving a reduced-order model from
the full-order model. Our presentation is in the context of finite-dimensional control
systems, although it generalizes easily to the infinite-dimensional setting (see,
e.g., [118, 119]).

We assume that the full-order state-space model has been derived via first principles
or empirical analysis or both, with autonomous state equation and output
equation, respectively,

$$\dot{x} = f(x,u) \tag{3.66}$$

$$y = h(x) \tag{3.67}$$

where $x \in \mathbb{R}^n$ represents local coordinates for the full-order state. It is also possible
that the actual physical system being modeled is available for experimentation.

An  ensemble  of  signals  is  needed  to  compute  the  empirical  basis.  In  order  for 
the  POD  to  yield  an  efficient  basis  for  an  orthogonal  expansion,  the  signals  in  the 
ensemble  should  represent  or  capture  the  essential  system  behavior.  This  means 
that  the  modeler  must  choose  one  or  more  sets  of  inputs  and  initial  conditions  to 
produce  what  he  deems  to  be  the  representative  system  response.  An  ensemble 
of state trajectories is then generated via numerical simulation of the state
equation (3.66), or possibly via experimentation if practical. A suitable set of sampling
times  must  be  chosen  to  determine  the  snapshot  data  which  forms  the  data  matrix. 

Given the data matrix, the empirical eigenvectors $\{\phi_1, \ldots, \phi_M\}$ corresponding
to the non-zero eigenvalues $\{\lambda_1, \ldots, \lambda_M\}$ are computed via the POD, whether by
direct SVD or the Sirovich method of snapshots, and arranged into the matrix

$$\Phi = [\phi_1, \ldots, \phi_M] \in \mathbb{R}^{n \times M} \tag{3.68}$$

The relative magnitudes of the eigenvalues are then analyzed in order to choose
a truncation index $k < M \ll n$ such that, from the viewpoint of the modeler, the
resulting model order is sufficiently low while retaining sufficiently high signal energy
on average, i.e., provides a favorable tradeoff between fidelity and complexity.
This analysis yields a truncated transformation matrix

$$\Phi_k = [\phi_1, \ldots, \phi_k] \in \mathbb{R}^{n \times k} \tag{3.69}$$

The reduced-order state $z \in \mathbb{R}^k$ and the approximate reconstruction $\hat{x} \in \mathbb{R}^n$ of the
original full-order state, respectively, are defined by

$$z = \Phi_k^T\, x \tag{3.70}$$

$$\hat{x} = \Phi_k\, z \tag{3.71}$$

The reduced state equation is computed via Galerkin projection

$$\dot{z} = \Phi_k^T\, \dot{x} = \Phi_k^T f(x,u) \approx \Phi_k^T f(\Phi_k z, u) \tag{3.72}$$

yielding the reduced-order control system

$$\dot{z} = \hat{f}(z,u) \tag{3.73}$$

$$\hat{y} = \hat{h}(z) \tag{3.74}$$

where the reduced system map $\hat{f} : \mathbb{R}^k \times U \to \mathbb{R}^k$ and reduced output map $\hat{h} :
\mathbb{R}^k \to \mathbb{R}^p$ are defined, respectively, by

$$\hat{f}(z,u) = \Phi_k^T f(\Phi_k z, u), \qquad \hat{h}(z) = h(\Phi_k z) \tag{3.75}$$

and $\hat{y} \in \mathbb{R}^p$ is the approximate reconstruction of the output.

The reduced state equation (3.73) is numerically integrated using initial reduced-order
state $z(0) = \Phi_k^T x(0)$ where $x(0)$ corresponds to the desired initial full-order
state. This produces the reduced-order state trajectory $z(t)$, from which an approximation
to the full-order trajectory is reconstructed via $\hat{x}(t) = \Phi_k z(t)$.
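
The procedure of (3.66)-(3.75) can be sketched end to end on a toy problem. Everything specific below is an assumption made for illustration and does not come from the models discussed in this thesis: the full-order model (a 30-state diffusion chain with a cubic loss term and one input), the input signal `u_of_t`, the classical RK4 integrator, the snapshot spacing, and the energy threshold used to choose `k`. The snapshots are used without centering, consistent with (3.70).

```python
import numpy as np

n = 30
# Hypothetical full-order model: x' = A x - 0.1 x^3 + B u (illustration only).
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = np.zeros(n)
B[0] = 1.0

def f(x, u):
    return A @ x - 0.1 * x ** 3 + B * u

def u_of_t(t):
    return np.sin(0.5 * t)

def rk4(rhs, x0, ts):
    """Integrate x' = rhs(x, u(t)) with classical RK4 on the time grid ts."""
    xs = np.empty((len(ts), len(x0)))
    xs[0] = x0
    for j in range(len(ts) - 1):
        t, h, x = ts[j], ts[j + 1] - ts[j], xs[j]
        k1 = rhs(x, u_of_t(t))
        k2 = rhs(x + 0.5 * h * k1, u_of_t(t + 0.5 * h))
        k3 = rhs(x + 0.5 * h * k2, u_of_t(t + 0.5 * h))
        k4 = rhs(x + h * k3, u_of_t(t + h))
        xs[j + 1] = x + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return xs

ts = np.linspace(0.0, 30.0, 601)
x0 = np.zeros(n)
x_full = rk4(f, x0, ts)                 # rows are full-order states x(t_j)

# Snapshots: every 10th state, stored as columns of the data matrix.
Xi = x_full[::10].T
Phi, sigma, _ = np.linalg.svd(Xi, full_matrices=False)
lam = sigma ** 2 / Xi.shape[1]          # eigenvalues of R_M

# Truncation index: smallest k retaining a chosen energy fraction (assumed).
energy_frac = np.cumsum(lam) / np.sum(lam)
k = int(np.searchsorted(energy_frac, 1.0 - 1e-6)) + 1
Phi_k = Phi[:, :k]

def f_hat(z, u):
    # Galerkin projection of the full-order vector field, as in (3.75).
    return Phi_k.T @ f(Phi_k @ z, u)

z = rk4(f_hat, Phi_k.T @ x0, ts)        # reduced trajectory, z(0) = Phi_k^T x(0)
x_hat = z @ Phi_k.T                     # rows are reconstructions Phi_k z(t_j)

rel_err = np.linalg.norm(x_full - x_hat) / np.linalg.norm(x_full)
```

The reduced model is tested here on the same input that generated the snapshots, which is the favorable case. Rerunning the last three lines with a different `u_of_t` gives a rough sense of how the same basis performs on trajectories outside the ensemble.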

Remarks 

We  conclude  this  section  with  some  remarks  on  the  advantages  and  drawbacks 
of  the  POD  as  a  tool  for  computing  a  coordinate  transformation  for  state-space 
model  reduction. 

•  The  modeler  has  a  great  deal  of  discretion  in  determining  the  reduced-order 
model.  He  chooses  the  ensemble  of  signals  (via  choice  of  inputs  and  initial 
conditions),  sampling  times,  and  truncation  index.  While  the  POD  is  a 
natural  tool  for  efficiently  representing  signals  in  an  ensemble,  it  is  merely 
part  of  an  ad-hoc  procedure  for  reducing  the  order  of  a  dynamical  system. 
It does not work directly with the system map $f$.

•  The POD is optimal for efficiently representing signals that belong to an
ensemble, e.g., state trajectories generated via simulation of the ODE (3.66)
for a chosen set of inputs. However, because the admissible controls constitute
a much larger set, the resulting family of ODEs will produce trajectories
that do not belong to the ensemble. The efficiency of the POD in terms of
representing all possible state trajectory signals is unknown.

•  There  have  been  no  results  of  which  we  are  aware  regarding  a  rigorous  or 
systematic  methodology  for  generating  a  representative  ensemble  of  signals 
to  characterize  the  state  response  of  a  nonlinear  control  system. 
•  Snapshot data may fail to capture dynamical effects occurring at widely
differing time scales.

•  The  state-to-output  relationship  y  =  h(x )  is  not  used  in  determining  the 
empirical  basis.  Since  the  output  consists  of  variables  of  particular  interest, 
it  would  appear  that  ignoring  this  relationship  is  to  the  method’s  detriment. 

3.2.4  Applications 

The  POD  has  become  prominent  as  a  tool  for  complexity  reduction  during  the 
1980s  and  1990s,  finding  application  in  a  wide  variety  of  areas.  Here  we  describe 
some  examples  in  order  to  illustrate  its  capabilities  for  efficient  representation  and 
the  ad-hoc  nature  of  the  procedure  as  applied  to  model  reduction  for  nonlinear 
dynamical  systems.  We  emphasize  applications  to  order  reduction  for  RTP  models 
in  order  to  provide  background  and  motivation  for  subsequent  material. 

The capabilities of the POD become apparent in the context of data compression
for image processing, for which the POD is a natural tool. For example,
in [146], the authors apply the POD to compress the amount of data needed to
reconstruct pixelized images of human faces ($2^7 \times 2^7$ pixels with $2^8$ gray levels). In
their study, linear combinations of 40 dominant “empirical eigenfaces” are capable
of representing face images, both within and outside of the original population
(115 faces), to within 3% error, thus reducing the dimension of the representation
space from $2^{22}$ to 40.

Various studies have demonstrated the effectiveness of the POD as a tool for deriving
low-order characterizations of a spatially distributed flow $v(t,x)$ representing
the time evolution of some physical phenomenon. Examples include applications
to pipe flow in a wall region [11], Rayleigh-Benard convective flow [40], turbulent
channel flow [12], and vibration of a thin membrane in a stadium [19]. In these
studies, the ensemble of signals $\{v(t,x)\}$ is a collection of realizations for the time-varying
flow field, i.e., the unique solutions of a family of initial value problems.
In each study it is shown that the pre-computed flow can be represented with
high accuracy using a number of empirical eigenfunctions that is relatively small
compared with the discretization resolution used to simulate the original evolution
equations ($O(10^3)$ reduction is consistently achieved). Moreover, the structural
aspect of the eigenfunctions is observed, e.g., as rolls and shearing motions in [12].

However, it is important to note that, in these examples, low-order approximations
to the original evolution equations are not derived. Rather, the focus is
on deriving low-order representations of pre-computed flow fields, a task naturally
suited for the POD. Thus, these applications do not necessarily fall within the
realm of what we consider to be model reduction. The subject of deriving low-order
evolution equations for spatially distributed flows using the POD basis is
covered in, e.g., [148], but examples are not offered and computational issues such
as those pointed out in Section 3.2.3 are not addressed.

During the 1990s, the POD has appeared in various ad-hoc methodologies
related to the control of state-space models for dynamical systems. Strategies
for control of turbulent flows are proposed in [99], where the authors use low-dimensional
models and knowledge of the extracted coherent structures to determine
how, when, and where to interfere with the flow in the boundary layer. In [25],
a nonlinear feedback law is constructed which requires information only about the
dominant empirical eigenfunctions without using the original nonlinear model.

There has been much recent activity in the development of models to be used
for control of the temperature distribution on a semiconductor wafer in an RTP
chamber. Dynamic models of heat transfer in a generic RTP system with 5 lamp
banks are presented in [5, 6]. The full-order model has 116 states, each representing
the temperature at a physical location in the chamber. In [6], an ensemble of
representative trajectories is generated by simulating the state equation using a
collection of control inputs consisting of a nominal optimal control (a time-varying
power setting for each lamp bank) and several perturbations via pseudo-random
binary sequences. It is then shown that a reduced model derived via Galerkin
projection onto the span of 30 POD basis vectors is sufficient to reproduce temperature
dynamics to within one-third of a degree. In [5], an ensemble is generated
by simulations using the nominal optimal control perturbed by a uniform 5%-10%.
After truncation of all but the most energetic 40 POD modes, further reductions
to a 15 state system were obtained by various procedures, including the “slaving”
of modes, i.e., setting the time derivatives of a pre-determined number of “slave”
modes to zero, resulting in a “steady manifold” and a set of differential-algebraic
equations.

Similar studies of model reduction for heat transfer in an RTP chamber with 3
lamp banks are presented in [1, 157]. In these studies, an ensemble is generated
via three simulations: one with each of the lamp banks set to 100% power and the
other two turned off. Reduced models with 4 and 5 states are derived
from the original models of order 100 and 76, respectively, using the POD and an
orthogonal collocation discretization scheme. The reduced model is then used to
compute  a  control  for  tracking  a  desired  temperature  trajectory  at  several  points 
on  the  wafer  surface. 

Another RTP heat transfer application, for which the order reduction is of
considerably greater magnitude, is presented in [13]. A finite-element discretization
of a generic RTP chamber results in a dynamic model with 5060 unknowns.
There are 3 operating points of interest, each corresponding to a particular uniform
steady-state temperature across the wafer surface. Consequently, the authors derive
three reduced models of order 10, each corresponding to one of the operating
points, and each capable of reproducing temperature trajectories in a vicinity of
the operating point to within 1% error. The overall strategy then involves switching
among the three reduced models according to a pre-determined set of rules.
Various  computational  issues  regarding  the  numerical  integration  of  a  switched  set 
of  state  equations  are  addressed. 

Remark 3.2.40 The ad-hoc nature of the POD applied to model reduction for
state-space control systems is made clear in the examples presented above. The
discretion of the modeler in choosing the ensemble and designing the overall procedure
is evident in all cases. Moreover, the state-to-output relationship is consistently
ignored in determining the reduced model. □


3.3  Balanced  Truncation  for  Linear  Systems 

One  technique  that  is  used  often  in  dealing  with  nonlinear  state-space  models  is 
linearization,  in  which  the  model  is  approximated,  locally  about  an  equilibrium, 
by  a  linear  system  derived  from  a  Taylor  series  expansion  of  the  system  map.  We 
then  work  with  the  resulting  linear  model  for  purposes  of,  e.g.,  control.  In  addition, 
there  are  various  system  identification  techniques  that  directly  yield  a  linear  model. 
One  method  for  reducing  the  order  of  a  linear  system  is  balanced  truncation.  It 
is  well  established  and  widely  applied,  mainly  due  to  its  simplicity,  computability, 
and good performance.

The balanced realization is one of the infinitely many different state-space realizations
for a given LTI system. Its properties have made it very useful in control
engineering and signal processing. Mullis and Roberts [111] first introduced the
balanced realization in 1976 to study roundoff noise in digital filters. In 1981,
Moore [109] proposed the balanced truncation method for reducing the order of
a stable finite-dimensional LTI system. The terminology “balanced” reflects the
characterization of the realization as one that is equally controllable and observable,
a notion that is made precise in Section 3.3.1. When a system is in balanced
form, the importance of an individual state component to the input-to-output behavior
of the system is indicated by the relative magnitude of its corresponding
Hankel singular value. This provides for a meaningful ranking of state components
and a guide for truncating those with relatively small contribution.

The method of balanced truncation, or balancing, has been extended in several
directions, including to infinite-dimensional linear systems [34, 55], unstable
linear systems [104, 122], closed-loop linear systems incorporating various types
of controller structures (e.g., LQG [71, 113]), and conservative mechanical
systems [158]. To accommodate many practical applications, the method has been
modified to incorporate frequency weighting [4, 44]. Glover [54] showed that the
balanced truncation method is not optimal with respect to the Hankel norm and
introduced a closely related method that achieves optimality. We will not work
with these extensions or modifications in this thesis. However, generalizations of
the method to the nonlinear setting are of paramount importance here and are
introduced in Section 3.3.5.

The material contained in this section is well known. Our exposition is mainly
based on the presentations in [54, 109, 141, 170], with some results drawn from [127].
We refer to the literature for the proofs.


3.3.1  Derivation 

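Consider the LTI system

$$\dot{x} = A x + B u, \qquad y = C x \tag{3.76}$$

where $A$ is $n \times n$, $B$ is $n \times m$, and $C$ is $p \times n$. The transfer function matrix for (3.76)
is given by

$$G(s) = C (sI - A)^{-1} B \tag{3.77}$$

We say that $(A, B, C)$ is a realization for $G(s)$. We say that $A$ is stable if
$\mathrm{spec}(A) \subset \mathbb{C}^{-}$, the open left half of the complex plane.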

Definition 3.3.1 (Controllability and Observability Gramians) Consider
the realization $(A, B, C)$ and let $A$ be stable. The $n \times n$ symmetric non-negative
definite matrices

$$W_c = \int_0^\infty \exp(At)\, B B^T \exp(A^T t)\, dt \tag{3.78}$$

$$W_o = \int_0^\infty \exp(A^T t)\, C^T C \exp(At)\, dt \tag{3.79}$$

exist and are called, respectively, the controllability Gramian and the observability
Gramian. □

Interpretations of the Gramians are important to understanding their use in
model reduction. Consider the following interpretations from the energy point of
view. The minimum control energy required to reach state $x_0$ from 0 in infinite time
is given by $x_0^T W_c^{-1} x_0$. Hence, $W_c^{-1}$ can be used as an indicator of the amount of
input energy needed to reach a given state. Similarly, the output energy generated
by releasing the system from initial state $x_0$ with the input turned off is given by
$x_0^T W_o x_0$. Hence, $W_o$ can be used as an indicator of the effect that a given initial
state has on the output.
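
As a minimal worked illustration of these energy interpretations, consider a hypothetical scalar system $\dot{x} = -a x + b u$, $y = c x$ with $a > 0$ and $b \neq 0$. The parameters $a$, $b$, and $c$ are assumptions introduced here for illustration and do not appear elsewhere in the text. The Gramian integrals of Definition 3.3.1 can then be evaluated in closed form:

```latex
% Hypothetical scalar example: \dot{x} = -a x + b u, y = c x, with a > 0, b \neq 0.
\begin{align*}
W_c &= \int_0^\infty e^{-at}\, b^2\, e^{-at}\, dt = \frac{b^2}{2a}, &
W_o &= \int_0^\infty e^{-at}\, c^2\, e^{-at}\, dt = \frac{c^2}{2a}, \\
x_0^T W_c^{-1} x_0 &= \frac{2a}{b^2}\, x_0^2, &
x_0^T W_o x_0 &= \frac{c^2}{2a}\, x_0^2 .
\end{align*}
```

In this sketch a weak input gain $b$ makes every nonzero state expensive to reach, while a weak output gain $c$ makes every initial state nearly invisible at the output.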

Another important interpretation is what Moore referred to as a signal injection
view of Kalman’s minimal realization theory. The controllability Gramian appears
in the PCA (see Section 2.3) of the matrix-valued signal $X(t) = \exp(At) B$. This
signal corresponds to a collection of state responses to unit impulses injected at
the input terminals. Similarly, the observability Gramian appears in the PCA of
the matrix-valued signal $Y^T(t) = \exp(A^T t) C^T$. The signal $Y(t)$ corresponds to
a collection of output responses to unit impulses injected as disturbances at the
output terminals of the input filter. In his derivation, Moore used the PCA of these
signals to characterize the controllable subspace and the orthogonal complement
of the unobservable subspace, and to find a coordinate system in which those
subspaces are spanned by the same PCA component vectors. This point of view
illustrates the connections between the POD, PCA, and balanced realizations for
linear systems.


Example 3.3.2 The controllability or observability Gramian alone cannot give
an accurate indication of the dominance of the state components pertaining to
the input-to-output behavior. Consider the following example from [170]. For the
transfer matrix

$$G(s) = \frac{3s + 18}{s^2 + 3s + 18} \tag{3.80}$$

we have the family of realizations

$$A = \begin{bmatrix} -1 & -4/\alpha \\ 4\alpha & -2 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 \\ 2\alpha \end{bmatrix}, \qquad C = \begin{bmatrix} -1 & 2/\alpha \end{bmatrix} \tag{3.81}$$

with Gramians

$$W_c = \begin{bmatrix} 1/2 & 0 \\ 0 & \alpha^2 \end{bmatrix}, \qquad W_o = \begin{bmatrix} 1/2 & 0 \\ 0 & 1/\alpha^2 \end{bmatrix} \tag{3.82}$$

We see that the degree to which the second state component is controllable or observable
can be made arbitrarily high or low, but not independently. □


The  following  properties  of  the  Gramians  are  important  for  our  purposes. 

Theorem 3.3.3 (Lyapunov Equations) Consider the realization $(A, B, C)$ and
let $A$ be stable. The controllability Gramian $W_c$ and the observability Gramian $W_o$
are the unique solutions of the matrix Lyapunov equations, respectively,

$$A W_c + W_c A^T + B B^T = 0 \tag{3.83}$$

$$A^T W_o + W_o A + C^T C = 0 \tag{3.84}$$

□
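
The Lyapunov characterization gives a direct way to compute the Gramians numerically. The following is a minimal sketch, not the thesis's implementation: it solves (3.83) and (3.84) by vectorizing the equations with Kronecker products, which is only sensible for small $n$, and applies the result to the realization family of Example 3.3.2 as read from [170]. The function names `solve_lyapunov`, `gramians`, and `example_realization`, and the choice `alpha = 3.0`, are assumptions made for illustration.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A X + X A^T + Q = 0 for X by vectorization (small n only)."""
    n = A.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(A X) = kron(A, I) vec(X), vec(X A^T) = kron(I, A) vec(X)
    K = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)

def gramians(A, B, C):
    """Controllability and observability Gramians via (3.83) and (3.84)."""
    Wc = solve_lyapunov(A, B @ B.T)
    Wo = solve_lyapunov(A.T, C.T @ C)
    return Wc, Wo

def example_realization(alpha):
    """Realization family of Example 3.3.2 for a nonzero parameter alpha."""
    A = np.array([[-1.0, -4.0 / alpha], [4.0 * alpha, -2.0]])
    B = np.array([[1.0], [2.0 * alpha]])
    C = np.array([[-1.0, 2.0 / alpha]])
    return A, B, C

alpha = 3.0  # illustrative value, not taken from the text
Wc, Wo = gramians(*example_realization(alpha))
print(np.round(Wc, 6))
print(np.round(Wo, 6))
```

For larger systems a dedicated Lyapunov solver would normally replace the Kronecker formulation, whose linear system has $n^2$ unknowns.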

Theorem 3.3.4 Consider the realization $(A, B, C)$ and let $A$ be stable. The controllability
Gramian $W_c$ is positive definite if and only if $(A, B)$ is controllable.
Likewise, the observability Gramian $W_o$ is positive definite if and only if $(C, A)$ is
observable. □

A  meaningful  ranking  of  states  is  provided  by  the  Hankel  singular  values. 

Definition 3.3.5 (Hankel Singular Values) Consider the realization $(A, B, C)$
with $A$ stable, controllability Gramian $W_c$, and observability Gramian $W_o$. The
Hankel singular values of the system are defined as the positive square roots of the
eigenvalues of the product $W_c W_o$, i.e.,

$$\sigma_i = \left( \lambda_i (W_c W_o) \right)^{1/2}, \qquad i \in \{1, \ldots, n\} \tag{3.85}$$

where by convention they are ordered such that $\sigma_i \geq \sigma_{i+1}$. □

Remark 3.3.6 Under similarity transformation $x = S \bar{x}$, the Gramians transform
to, respectively,

$$\bar{W}_c = S^{-1} W_c (S^T)^{-1}, \qquad \bar{W}_o = S^T W_o S \tag{3.86}$$

and the product $W_c W_o$ transforms to $S^{-1} W_c W_o S$. Thus, the Hankel singular
values are invariant under similarity transformation. □

Remark 3.3.7 If the realization $(A, B, C)$ is minimal then the Hankel singular
values are strictly positive. We will work only with minimal realizations. □

Fact 3.3.8 Let $G(s)$ be the transfer function matrix for the minimal realization
$(A, B, C)$ with $A$ stable. The largest Hankel singular value of the system is equal
to the Hankel norm of the system, i.e.,

$$\| G \|_H = \sigma_1 \tag{3.87}$$

The other Hankel singular values may be characterized inductively in a similar
way. □

Example  3.3.9  The  Hankel  singular  values  in  Example  3.3.2  are  1  and  0.5.  □ 
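
The values quoted in Example 3.3.9, and the invariance stated in Remark 3.3.6, can be checked numerically. The sketch below is an illustration under stated assumptions rather than the author's code: the Lyapunov solver uses a Kronecker-product formulation suitable only for small systems, and the parameter values and the similarity transformation `S` are hypothetical choices.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A X + X A^T + Q = 0 for X by vectorization (small n only)."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)

def hankel_singular_values(A, B, C):
    """Square roots of the eigenvalues of Wc Wo, in descending order (3.85)."""
    Wc = solve_lyapunov(A, B @ B.T)
    Wo = solve_lyapunov(A.T, C.T @ C)
    eigvals = np.linalg.eigvals(Wc @ Wo)
    # In exact arithmetic these eigenvalues are real and non-negative
    return np.sort(np.sqrt(np.abs(eigvals.real)))[::-1]

def example_realization(alpha):
    """Realization family of Example 3.3.2 for a nonzero parameter alpha."""
    A = np.array([[-1.0, -4.0 / alpha], [4.0 * alpha, -2.0]])
    B = np.array([[1.0], [2.0 * alpha]])
    C = np.array([[-1.0, 2.0 / alpha]])
    return A, B, C

# The Gramians depend on alpha, but the Hankel singular values do not
hsv_by_alpha = {a: hankel_singular_values(*example_realization(a))
                for a in (0.1, 1.0, 10.0)}

# Hypothetical invertible change of coordinates x = S xbar
S = np.array([[2.0, 1.0], [0.0, 1.0]])
S_inv = np.linalg.inv(S)
A, B, C = example_realization(1.0)
hsv_transformed = hankel_singular_values(S_inv @ A @ S, S_inv @ B, C @ S)

for a, hsv in hsv_by_alpha.items():
    print(a, np.round(hsv, 6))
print("transformed:", np.round(hsv_transformed, 6))
```

Every member of the family, and the transformed realization, should report the same pair of Hankel singular values, which is the sense in which they rank states independently of the chosen coordinates.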

We  now  define  the  balanced  realization  in  terms  of  the  Gramians  and  note  its 
relationship  with  the  Hankel  singular  values. 

Definition 3.3.10 (Balanced Realization) A minimal realization $(A, B, C)$
with $A$ stable, controllability Gramian $W_c$, and observability Gramian $W_o$ is said
to be balanced if

$$W_c = \Sigma = W_o \tag{3.88}$$

where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n) > 0$. □

Remark 3.3.11 Such a realization is also referred to as internally balanced [109]
and diagonal balanced [66]. □

Remark 3.3.12 The diagonal entries in $\Sigma$ correspond to the Hankel singular values
of the system. □

Remark 3.3.13 The origin of the terminology “balanced” is apparent, i.e., the
system is equally controllable and observable, as indicated by the equality of the
Gramians. □

Theorem 3.3.14 (Existence and Uniqueness of a Balanced Realization)
Let the realization $(A, B, C)$ be minimal and let $A$ be stable. Then there exists a
coordinate transformation $x = S_{\mathrm{bal}} \bar{x}$ such that

$$\bar{W}_c = S_{\mathrm{bal}}^{-1} W_c (S_{\mathrm{bal}}^T)^{-1} = \Sigma = S_{\mathrm{bal}}^T W_o S_{\mathrm{bal}} = \bar{W}_o \tag{3.89}$$

where $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n) > 0$. It is unique up to an arbitrary orthogonal transformation
$T$ such that $T \Sigma = \Sigma T$. □

Remark 3.3.15 We refer to $S_{\mathrm{bal}}$ as the balancing coordinate transformation. □

There  are  two  other  special  forms  that  are  related  to  the  balanced  realization 
and  that  will  be  important  for  computing  balanced  realizations  in  the  nonlinear 
setting. 

Definition 3.3.16 (Input-Normal, Output-Normal) A minimal realization
$(A, B, C)$ with $A$ stable, controllability Gramian $W_c$, and observability Gramian
$W_o$ is said to be input-normal if

$$W_c = I, \qquad W_o = \Sigma^2 \tag{3.90}$$

where $\Sigma$ is the diagonal matrix of Hankel singular values. Furthermore, it is said
to be output-normal if

$$W_c = \Sigma^2, \qquad W_o = I \tag{3.91}$$

□

Remark 3.3.17 The input-normal and output-normal realizations are easily obtained
from the balanced realization by scalings on the states, respectively,
$x = \Sigma^{1/2} \bar{x}$ and $x = \Sigma^{-1/2} \bar{x}$. □

3.3.2  Properties 

We now justify the use of the balanced realization as a tool for model reduction.
Let $A$ be stable and $(A, B, C)$ be a minimal realization in balanced form so that
$W_c = \Sigma = W_o$. Then the magnitude of the Hankel singular value $\sigma_i$, relative to
the others, is an indication of the degree to which the $i$-th state component is,
simultaneously, controllable and observable, relative to the others. Equivalently,
a small $\sigma_i$ means that it is relatively difficult both to reach and to observe the
state $(0, \ldots, 0, x_i, 0, \ldots, 0)$, and vice versa. Finally, from an energy point of view,
$\sigma_i$ is interpreted as indicating the contribution of the $i$-th state component to the
input-to-output energy gain of the system, as measured by the Hankel norm.

Suppose that $\sigma_k \gg \sigma_{k+1}$ for some $k$. Then those states corresponding to
$\sigma_{k+1}, \ldots, \sigma_n$ are considerably less important than states corresponding to $\sigma_1, \ldots, \sigma_k$.
Consequently, truncation of the less important states causes little degradation to
the predictive capability of the model, as pertaining to the input-to-output behavior.
These ideas are made more precise in the following discussion.

We derive the reduced-order model via partitioning and truncation. Consider
the partition of the balanced Gramian

$$\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix} \tag{3.92}$$

where $\Sigma_1 = \mathrm{diag}(\sigma_1, \ldots, \sigma_k)$ and $\Sigma_2 = \mathrm{diag}(\sigma_{k+1}, \ldots, \sigma_n)$. In addition, we partition
$(A, B, C)$ accordingly as

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \qquad C = \begin{bmatrix} C_1 & C_2 \end{bmatrix} \tag{3.93}$$

We refer to the subsystem $(A_{11}, B_1, C_1)$ as the truncated system. It can be used as
a $k$-th order reduced model to approximate the $n$-th order full model $(A, B, C)$.
The following results say that the truncated system is balanced and stable.


Theorem 3.3.18 (Pernebo and Silverman [127]) Let $(A, B, C)$ be balanced
with Gramian $\Sigma$ and partitions (3.92) and (3.93). Then both subsystems
$(A_{11}, B_1, C_1)$ and $(A_{22}, B_2, C_2)$ are balanced and their controllability and observability
Gramians are equal to, respectively, $\Sigma_1$ and $\Sigma_2$. □


Theorem 3.3.19 (Pernebo and Silverman [127]) Let $(A, B, C)$ be balanced
with Gramian $\Sigma$ and partitions (3.92) and (3.93). If $\sigma_k > \sigma_{k+1}$ (i.e., $\sigma_k \neq \sigma_{k+1}$)
then both subsystems $(A_{11}, B_1, C_1)$ and $(A_{22}, B_2, C_2)$ are asymptotically stable. □


Remark 3.3.20 Truncation amounts to setting

$$x_{k+1} = \cdots = x_n = 0 \tag{3.94}$$

There is another method [46] for generating a reduced model in which we set

$$\dot{x}_{k+1} = \cdots = \dot{x}_n = 0 \tag{3.95}$$

Using the latter method, the full-order and reduced-order models have matching
DC gains (steady-state response). The former method typically produces a better
approximation to the full-order model over a range of frequencies, but the DC gains
are not guaranteed to match. Using either method, the reduced model is stable and
balanced. □
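
The difference between the two reductions in Remark 3.3.20 can be seen on a small case. The sketch below is illustrative only and makes several assumptions: it uses the $\alpha = 1$ member of the family in Example 3.3.2 with its two states swapped, which is balanced with $\Sigma = \mathrm{diag}(1, 1/2)$, and it implements (3.95) with the standard singular perturbation formulas, which the text does not spell out. Note that setting the derivatives to zero introduces a direct feedthrough term `Dr` even though (3.76) has none.

```python
import numpy as np

# Balanced realization with Sigma = diag(1, 1/2): the alpha = 1 member of the
# family in Example 3.3.2 with its two states swapped so that sigma_1 >= sigma_2.
A = np.array([[-2.0, 4.0], [-4.0, -1.0]])
B = np.array([[2.0], [1.0]])
C = np.array([[2.0, -1.0]])
k = 1  # order of the reduced model

# Partition as in (3.93)
A11, A12, A21, A22 = A[:k, :k], A[:k, k:], A[k:, :k], A[k:, k:]
B1, B2 = B[:k], B[k:]
C1, C2 = C[:, :k], C[:, k:]

def dc_gain(A, B, C, D=None):
    """Steady-state gain G(0) = D - C A^{-1} B."""
    G0 = -C @ np.linalg.solve(A, B)
    return G0 if D is None else G0 + D

# (3.94): truncation keeps (A11, B1, C1)
dc_trunc = dc_gain(A11, B1, C1)

# (3.95): set the derivatives of the discarded states to zero and eliminate them
A22_inv = np.linalg.inv(A22)
Ar = A11 - A12 @ A22_inv @ A21
Br = B1 - A12 @ A22_inv @ B2
Cr = C1 - C2 @ A22_inv @ A21
Dr = -C2 @ A22_inv @ B2
dc_resid = dc_gain(Ar, Br, Cr, Dr)

dc_full = dc_gain(A, B, C)
print("full:", dc_full, "truncated:", dc_trunc, "residualized:", dc_resid)
```

In this example the second method reproduces the full-order DC gain while truncation does not, consistent with the remark.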

The performance of balanced truncation as a model reduction method is characterized
by the following error bound.

Theorem 3.3.21 (Glover [54]) Let $(A, B, C)$ be balanced with Gramian $\Sigma$, partitions
(3.92) and (3.93), and transfer function matrix $G(s)$. Let the truncated
system $(A_{11}, B_1, C_1)$ have transfer function matrix $G_1(s)$. Then

$$\| G - G_1 \|_H \leq \| G - G_1 \|_\infty \leq 2 \sum_{i=k+1}^{n} \sigma_i \tag{3.96}$$

□

Remark 3.3.22 Thus, if $\sigma_{k+1}, \ldots, \sigma_n$ are small then the error is small and the
truncated system is a good approximation in terms of the Hankel norm, i.e., the
error bound can be used as a measure of model fidelity corresponding to (1.1). □

Remark 3.3.23 The upper bound for the error (3.96) is not optimal, but is close
to optimal. See Glover [54] for a characterization of all optimal Hankel norm
approximations to $(A, B, C)$ and associated error bounds. □
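
The right-hand inequality in (3.96) can be examined numerically on a small case. The following sketch is an illustration, not part of the original development: it uses the $\alpha = 1$ member of the family in Example 3.3.2 with its states swapped (balanced, with Hankel singular values 1 and 1/2), and it estimates the $\infty$-norm of the error by sampling the frequency response on a grid, which in general yields only a lower estimate of the true norm. The grid and the helper name `freq_resp` are assumptions.

```python
import numpy as np

# Balanced realization with Sigma = diag(1, 1/2): the alpha = 1 member of the
# family in Example 3.3.2 with its two states swapped.
A = np.array([[-2.0, 4.0], [-4.0, -1.0]])
B = np.array([[2.0], [1.0]])
C = np.array([[2.0, -1.0]])
sigma = np.array([1.0, 0.5])  # Hankel singular values of this realization
k = 1

# Truncated system (A11, B1, C1) from the partition (3.93)
A11, B1, C1 = A[:k, :k], B[:k, :], C[:, :k]

def freq_resp(A, B, C, w):
    """Scalar frequency response C (jw I - A)^{-1} B at frequency w."""
    n = A.shape[0]
    return (C @ np.linalg.solve(1j * w * np.eye(n) - A, B))[0, 0]

# Sample the error on a frequency grid that includes w = 0
omegas = np.concatenate(([0.0], np.logspace(-3, 3, 2000)))
err = max(abs(freq_resp(A, B, C, w) - freq_resp(A11, B1, C1, w)) for w in omegas)

bound = 2.0 * sigma[k:].sum()  # right-hand side of (3.96)
print("sampled error:", err, "bound:", bound)
```

For this particular example the sampled error reaches the bound at zero frequency, so the bound cannot be tightened here without further assumptions.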

3.3.3  Computation 

Given a minimal realization $(A, B, C)$ with $A$ stable, a balanced realization can be
obtained efficiently through the standard algorithms of Laub [90], Moore [109], and
the more elegant algorithm of Laub et al. [91]. The latter algorithm is currently,
and has been since 1994, the standard balancing algorithm used in MATLAB [102].
These algorithms use standard, efficient, and stable matrix algebra and decomposition
routines such as for SVD and Cholesky decomposition. There is little or no
trouble in computing balanced realizations for LTI systems in most situations.
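
To make the role of the Cholesky and SVD routines concrete, the following is a minimal sketch of one common square-root style construction of a balancing transformation. It is offered under stated assumptions and is not claimed to reproduce the algorithms of [90], [91], or [109] as presented in Appendix E: the Gramians are factored as $W_c = L_c L_c^T$ and $W_o = L_o L_o^T$, the SVD $L_o^T L_c = U \Sigma V^T$ is computed, and $S = L_c V \Sigma^{-1/2}$ is used in $x = S \bar{x}$. The Lyapunov solver is a Kronecker-product formulation suitable only for small $n$, and the test system is Example 3.3.2 with the illustrative value $\alpha = 3$.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A X + X A^T + Q = 0 for X by vectorization (small n only)."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)

def balance(A, B, C):
    """Return a balanced realization, the Hankel singular values, and S."""
    Wc = solve_lyapunov(A, B @ B.T)
    Wo = solve_lyapunov(A.T, C.T @ C)
    # Symmetrize to suppress round-off before factoring
    Lc = np.linalg.cholesky((Wc + Wc.T) / 2)  # Wc = Lc Lc^T
    Lo = np.linalg.cholesky((Wo + Wo.T) / 2)  # Wo = Lo Lo^T
    U, sigma, Vt = np.linalg.svd(Lo.T @ Lc)   # Lo^T Lc = U diag(sigma) V^T
    S = Lc @ Vt.T / np.sqrt(sigma)            # S = Lc V Sigma^{-1/2}
    S_inv = (U / np.sqrt(sigma)).T @ Lo.T     # S^{-1} = Sigma^{-1/2} U^T Lo^T
    return S_inv @ A @ S, S_inv @ B, C @ S, sigma, S

# Example 3.3.2 with an illustrative parameter value
alpha = 3.0
A = np.array([[-1.0, -4.0 / alpha], [4.0 * alpha, -2.0]])
B = np.array([[1.0], [2.0 * alpha]])
C = np.array([[-1.0, 2.0 / alpha]])

Ab, Bb, Cb, sigma, S = balance(A, B, C)

# Recompute the Gramians in the new coordinates to confirm (3.89)
Wc_bal = solve_lyapunov(Ab, Bb @ Bb.T)
Wo_bal = solve_lyapunov(Ab.T, Cb.T @ Cb)
print(np.round(sigma, 6))
print(np.round(Wc_bal, 6))
print(np.round(Wo_bal, 6))
```

The Cholesky factorizations require positive definite Gramians, so this sketch presumes a minimal realization, in line with Theorem 3.3.4.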

Numerical difficulties arise in using these algorithms when the condition number
of the product matrix $W_c W_o$ is high. This corresponds to the situation where
some states are nearly uncontrollable or nearly unobservable, i.e., the realization is
nearly non-minimal. Safonov and Chiang [136] introduced a method to deal with
this situation, which they called a “Schur method” in reference to the fact that it
is computed using the Schur decomposition of the product matrix $W_c W_o$.

The idea is to avoid balancing altogether by computing orthonormal bases,
via the ordered Schur form, for the left and right eigenspaces associated with
the “big” eigenvalues of $W_c W_o$. The reduced $k$-th order model is not necessarily
balanced, but has exactly the same transfer function as would any $k$-th order
balanced realization. Thus, the reduced unbalanced model enjoys the same error
bound as a reduced balanced model. The algorithm is stable and effective without
regard to nearness to unobservability or uncontrollability.

Finally, we note that Helmke and Moore [66] have presented gradient flows
on the class of positive definite matrices that converge to the unique symmetric
positive definite balancing transformation matrix, as well as to all balancing transformation
matrices for a given realization $(A, B, C)$. The gradient flows converge
exponentially fast to the balanced realization. The gradient flow method has not
yet become prominent for applications due to issues of practical implementation.

The details of the various algorithms discussed here are presented in Appendix E.

3.3.4 Applications

Balanced truncation became prominent as a tool for model reduction during the
1980s and 1990s mainly due to its simplicity, computability, and good performance.
It  has  been  applied  in  a  wide  variety  of  areas,  including  the  several  that  we  briefly 
discuss  here.  We  note  that  in  these  and  most  practical  situations,  the  balancing 
procedure  is  modified  somewhat  to  suit  the  intended  application.  However,  the 
role  which  ad-hoc  techniques  play  in  balanced  model  reduction  is  small  compared 
with  that  for  POD  methods. 

One drawback of balancing is that the state variables in the balanced realization
lose their physical meaning. Blelloch, Mingori, and Wei [17] addressed this
problem in their application of balanced truncation to linear models for lightly
damped mechanical systems with gyroscopic and small circulatory forces such as
large flexible space structures. Specifically, they used the result that the modal representation
for certain systems becomes asymptotically balanced as the damping
approaches zero. Thus, a lightly damped structure in modal form, which retains
the physical meaning of the state variables, is approximately balanced. After deriving
an approximate balanced realization for the model of a dual-spin spacecraft,
they take advantage of its properties to reduce the model order.

Friswell,  Penny,  and  Garvey  [50]  conducted  a  comparative  study  of  reduction 
methods  for  models  of  high  degree-of-freedom  mechanical  structures  with  local 
nonlinearities.  The  nonlinear  term  (nonlinear  forces)  was  ignored  in  computing  the 
linear  coordinate  transformations,  which  were  then  applied  to  the  full  nonlinear 
model.  The  authors  applied  the  reduction  methods  to  models  of  a  cantilever  beam 
system and a pinned beam system. For the simulations they conducted, reduced-order
models derived via balanced truncation predicted the response of the full-order
nonlinear model more accurately than the other methods, including modal
coordinates.

Ramirez  and  Maciejowski  [130]  directly  formulated  a  balanced  model  for  a 
stirred  tank  chemical  reactor  from  pulse  response  data.  They  used  the  canonical 
form  of  Ober  [122,  123]  and  a  related  system  identification  algorithm  to  produce 
a  high-order  balanced  realization.  Truncation  yielded  a  3-state  linear  model.  The 
input-to-output  response  of  the  low-order  balanced  model  was  compared  with  that 
of a 3-state nonlinear physical-chemical model and its linearization. For the simulations
they conducted, the balanced model captured the nonlinear dynamics of the
system  more  accurately  than  did  the  linearization.  It  was  then  used  in  the  design 
of  an  LQG  optimal  control  law  and  Kalman  filter  for  regulation  in  the  presence  of 
large  disturbances. 

Finally,  we  note  that  balanced  truncation  has  been  studied  [116,  120],  and  used 
in  at  least  one  commercial  software  package  [129],  for  order  reduction  of  nonlinear 
models  for  heat  transfer  in  RTP  chambers.  In  these  applications,  linearized  versions 
of  the  models  are  used  to  compute  the  balancing  coordinate  transformation,  which 
is  then  applied  to  the  original  nonlinear  model.  Truncation  yields  low-order  models 
for  use  in  simulation,  control,  and  optimization. 

3.3.5  Nonlinear  Generalizations 

There  have  been  two  recent  independent  attempts  (of  which  we  are  aware)  to 
generalize  the  method  of  balanced  truncation  to  the  nonlinear  setting.  In  1993, 
Scherpen  [140,  141]  introduced  a  general  theory  and  procedure  of  balancing  for  a 
class  of  stable,  affine,  smooth,  finite-dimensional  nonlinear  systems.  The  approach 
is inherently nonlinear; it produces a nonlinear balancing coordinate transformation
that is local to a neighborhood of the origin. In 1999, Lall, Marsden, and
Glavaski [137] introduced a method that is inherently linear; it produces a linear
change of coordinates by constructing and decomposing matrices that serve as generalized
Gramians for the nonlinear system. We introduce and remark on these
methods  below  to  provide  background  for  Chapter  4. 

Scherpen  Nonlinear  Balancing 

The  Scherpen  methodology  departs  from  the  signal  injection  viewpoint  of  Moore 
while  remaining  consistent  with  its  results  in  the  LTI  case.  The  main  objects 
of  importance  are  the  controllability  and  observability  energy  functions.  These 
functions  serve  the  role  that  the  controllability  and  observability  Gramian  matrices 
do  in  linear  balancing,  i.e.,  they  provide  a  well-defined  measure  of  the  degree  to 
which  a  system  is,  respectively,  controllable  or  observable.  However,  unlike  the 
Gramians,  they  are  not  easily  computable. 

In  the  LTI  case,  the  energy  functions  specialize  to  quadratic  forms  involving 
the  Gramian  matrices.  Thus,  it  is  natural  that,  in  the  nonlinear  setting,  the  first 
step  in  the  balancing  procedure  is  to  determine  a  nonlinear  change  of  coordinates 
in a neighborhood of the origin under which the controllability function is locally
quadratic. This is accomplished by application of the Morse-Palais lemma,
which gives a quadratic canonical form for functions in the neighborhood of a
non-degenerate critical point. Again, there exist no practical methods for computing
the  desired  change  of  coordinates. 

Further  nonlinear  coordinate  transformations  take  the  system  to  special  forms 
that  are  analogous  to  input-normal,  output-normal,  and  balanced.  When  the 
system is in balanced form, state components can be ranked and deleted according
to their respective contributions to the input-to-output energy of the system, as
indicated  by  the  respective  relative  magnitudes  of  the  singular  value  functions, 
which  are  generalizations  of  the  Hankel  singular  values. 

Remark 3.3.24 The method is consistent with the LTI balancing procedure in
the following sense. Suppose the nonlinear system is realized with $(f, g, h)$ which
has linearization $(A, B, C)$ about 0. Let $\mathcal{T}$ be the nonlinear balancing transformation
for $(f, g, h)$ about 0 and $(\bar{f}, \bar{g}, \bar{h})$ be the balanced realization. Let $S$ and
$(\bar{A}, \bar{B}, \bar{C})$ be the linearizations, respectively, of $\mathcal{T}$ and $(\bar{f}, \bar{g}, \bar{h})$ about 0. Let $T$ be
the balancing transformation matrix for $(A, B, C)$. Then $S = T$ and $(\bar{A}, \bar{B}, \bar{C}) =
(T^{-1} A T, T^{-1} B, C T)$. □

Remark  3.3.25  In  contrast  to  the  linear  case,  the  nonlinear  balancing  procedure 
is  not  immediately  amenable  to  computational  implementation.  For  example,  the 
controllability  energy  function  corresponds  to  the  value  function  for  a  nonlinear 
optimal  control  problem.  Also,  the  Morse-Palais  lemma  guarantees  the  existence 
of  a  transformation  to  the  desired  canonical  form,  but  provides  no  constructive 
procedure  for  obtaining  it.  Thus,  tools  have  not  yet  appeared  for  computing  balanced 
realizations  for  nonlinear  systems,  and  the  Scherpen  procedure  has  not  yet  been 
applied  as  a  tool  for  model  reduction.  □ 

Scherpen  Pseudo-Balancing 

We  note  that,  in  1994,  Scherpen  [141]  also  introduced  a  method  for  balancing  in  the 
special  case  of  a  nonlinear  Hamiltonian  system.  A  special  technique  is  necessary 
because  such  a  system  is  conservative  and  therefore  not  asymptotically  stable.  The 
method  is  a  nonlinear  generalization  of  the  pseudo-balancing  approach  of  van  der 
Schaft and Oeloff [158]. The idea is to derive an associated “gradient system”
of dimension $n$, from the original Hamiltonian system of dimension $2n$, that is
asymptotically stable and thus can be balanced.

Remark 3.3.26 To date, the only “balanced” realizations for nonlinear systems
that have appeared in the literature have been derived via pseudo-balancing [141,
159], i.e., are pseudo-balanced Hamiltonian systems. □

LMG  Nonlinear  “Balancing” 

The  method  of  Lall,  Marsden,  and  Glavaski  (LMG)  adopts  the  signal  injection 
viewpoint  of  Moore  and  extends  it  by  expanding  the  class  of  allowable  impulsive 
test  signals  to  include  rigid  rotations  and  uniform  scalings  of  the  original  impulsive 
vector signals. In a procedure analogous to that of Moore, application of PCA to
the  resulting  collection  of  system  response  signals  produces  matrices,  called  the 
empirical  controllability  Gramian  and  the  empirical  observability  Gramian,  that 
serve  the  role  that  their  respective  counterparts  do  in  the  LTI  case.  Thus,  again 
like  that  of  Moore,  the  approach  is  strongly  related  to  the  POD. 

Remark  3.3.27  As  with  the  Scherpen  method,  the  LMG  method  is  consistent  with 
the  LTI  balancing  procedure.  It  is  shown  that  the  use  of  the  expanded  input  signal 
space  has  no  effect  when  applied  in  the  LTI  case,  i.e.,  the  LMG  method  specializes 
to  the  usual  balanced  truncation  for  LTI  systems.  □ 

Remark  3.3.28  The  computational  framework  of  the  LMG  method  is  the  same 
as  that  of  Moore,  i.e.,  it  involves  matrix  algebra  and  decompositions.  Therefore,  it 
retains  the  computational  ease  and  efficiency  of  LTI  balancing.  □ 

Remark 3.3.29 The nomenclature “empirical Gramian” used to distinguish the
new objects from the familiar Gramians is misleading, since the empirical Gramians
are no more nor less “empirical” than the familiar Gramians. This is made clear
by  Moore’s  signal  injection  viewpoint.  The  new  objects  are  better  described  as 
extended  Gramians,  to  indicate  that  they  are  constructed  by  taking  account  of  the 
broader  class  of  impulsive  test  signals.  Moreover,  the  properties  of  these  Gramians 
pertaining  to  controllability  and  observability  in  the  nonlinear  setting  are  unclear 
and  not  discussed.  □ 

The authors claim that the method provides a balanced truncation via ranking
of subspaces, but they never define what they mean by a “balanced
nonlinear system.” The proposed change of coordinates balances the empirical
Gramians, i.e., makes them equal to the same positive-definite diagonal matrix.
While the consequences of such a change of coordinates are known in the LTI
case, the implications for the nonlinear system realization are not known and are
never discussed. Moreover, no quantification of the importance of a particular
subspace is offered. Thus, it is unclear exactly how the LMG method is an extension
of balancing to nonlinear systems.

The LMG method does provide an organized framework for injecting test signals
with additional degrees of freedom beyond that which Moore described, while
remaining compatible with the linear theory, and reinforcing the importance and
special properties of impulsive signals as inputs. In particular, the rigid rotations
and scalings of the impulsive inputs are chosen such that they excite the nonlinear
system in some appropriate manner. The authors suggest that experience and
experimental data may be useful in choosing parameters. Thus, the method suffers
from some of the shortcomings associated with the POD, albeit with the choice of
input signals parameterized in an organized fashion.

Remark 3.3.30 Generalizations of the Gramian matrices to the nonlinear setting
that are compatible with the Scherpen theory for nonlinear balancing have recently
appeared [59]. These generalizations have the advantage that their properties
pertaining to controllability and observability are well-defined. □

3.4  Component  Truncation 

Recall the general methodology for state-space model reduction outlined in
Section 1.1.1 and illustrated by Figure 1.1. Once a coordinate transformation has
been selected, it is applied to the model equations to yield the transformed system.
Then, state components are ranked, and some are deleted. We will refer
to this process as component truncation. We have already presented component
truncation procedures for the special cases of

• a linear orthogonal coordinate transformation $\Phi$ for a nonlinear model (POD,
  via projection $\Phi_k$); and

• a linear coordinate transformation $S_{bal}$ for an LTI model (balanced truncation,
  via partition and truncation).

Here  we  present  the  general  procedure  and  make  some  remarks  about  practical 
computation. 

Let the full-order state-space model be given by the autonomous state equation
and output equation, respectively,

$$\dot{x} = f(x, u) \tag{3.97}$$

$$y = h(x) \tag{3.98}$$

where $x \in \mathbb{R}^n$ represents local coordinates for the full-order state. Suppose that
we apply the diffeomorphic change of coordinates

$$z \mapsto x = S(z) \tag{3.99}$$

under which the system map and output map transform to (see Section 2.1),
respectively,

$$\tilde{f}(z, u) = [DS(z)]^{-1} f(S(z), u) \tag{3.100}$$

$$\tilde{h}(z) = h(S(z)) \tag{3.101}$$

We reduce the system order, or truncate state components, by setting

$$z_{k+1} = \cdots = z_n = 0 \tag{3.102}$$

so that $\tilde{f}_{k+1}, \ldots, \tilde{f}_n$ are no longer relevant to the model. We define the truncated
state

$$z^1 = [z_1, \ldots, z_k]^T \tag{3.103}$$

and the truncated system map and output map, respectively,

$$\tilde{f}^1(z^1, u) = \begin{bmatrix} \tilde{f}_1(z_1, \ldots, z_k, 0, \ldots, 0, u) \\ \vdots \\ \tilde{f}_k(z_1, \ldots, z_k, 0, \ldots, 0, u) \end{bmatrix}, \qquad \tilde{h}^1(z^1) = \tilde{h}(z_1, \ldots, z_k, 0, \ldots, 0) \tag{3.104}$$

The reduced-order model and approximate state reconstruction are given by

$$\dot{z}^1 = \tilde{f}^1(z^1, u) \tag{3.105}$$

$$y = \tilde{h}^1(z^1) \tag{3.106}$$

and

$$\hat{x} = S(z_1, \ldots, z_k, 0, \ldots, 0) \tag{3.107}$$

The reduction in dimensionality of the state-space manifold from n to k is
not necessarily reflected in practical computations. Suppose that we are given
$(f, g, h)$ and compute a suitable local transformation S. Consider a numerical
integration of the reduced state equation (3.105). This computation requires, for
each time step, evaluation of the k real-valued functions in the reduced system map
$\tilde{f}^1 = (\tilde{f}_1, \ldots, \tilde{f}_k)^T$. However, practically speaking, these function evaluations must
be performed using the known original $f$ and known diffeomorphism S via (3.100).
Thus, it is actually necessary to evaluate the n real-valued functions in the full
system map $f = (f_1, \ldots, f_n)$, partially defeating the purpose of the reduction. This
dilemma has been alluded to in the applied POD literature [5, 6] without further
explanation.
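
The point about practical computation can be made concrete with a short sketch. It assumes that the full map f, the diffeomorphism S, and its Jacobian DS are available as callables; the helper name make_reduced_rhs and the linear test data are hypothetical. Each call of the reduced right-hand side still evaluates all n components of f and solves an n-by-n linear system before discarding the last n - k entries.

```python
import numpy as np


def make_reduced_rhs(f, S, DS, n, k):
    """Build the truncated system map z1 -> first k components of [DS]^{-1} f(S(z), u).

    f  : full system map, f(x, u) -> array of length n
    S  : change of coordinates z -> x
    DS : Jacobian of S, DS(z) -> n-by-n array
    """
    def reduced_rhs(z1, u):
        z = np.concatenate([z1, np.zeros(n - k)])   # set z_{k+1} = ... = z_n = 0
        x = S(z)                                    # approximate state reconstruction
        full = np.linalg.solve(DS(z), f(x, u))      # all n transformed components
        return full[:k]                             # keep only the first k
    return reduced_rhs


# Hypothetical linear test data, chosen so the answer is known in closed form.
A = np.array([[-1.0, 0.2, 0.0], [0.0, -2.0, 0.3], [0.1, 0.0, -3.0]])
B = np.array([[1.0], [0.5], [0.2]])
T = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]])

rhs = make_reduced_rhs(lambda x, u: A @ x + B @ u,
                       lambda z: T @ z,
                       lambda z: T,
                       n=3, k=2)
z1 = np.array([0.3, -0.2])
u = np.array([0.7])
value = rhs(z1, u)
```

In the linear case the same quantity is available from the leading k-by-k block of the transformed matrices, which is why the issue does not arise there.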

Remark 3.4.1 It appears desirable to have a computational procedure in which
the transformed system map $\tilde{f}$ can be evaluated without needing to evaluate the full
system map $f$. □

Remark 3.4.2 This issue does not enter into the LTI case. The linear structure
allows for the elimination of a subsystem that is completely irrelevant to
computations for the reduced model. □

3.5  Remarks 

We have studied two prominent methodologies for model reduction of linear and
nonlinear systems. The POD approach is applicable to linear and nonlinear models,
and to models of finite and infinite dimension. It produces a set of basis vectors
for a linear orthogonal change of coordinates. The basis vectors are called empirical
eigenvectors, and in certain contexts they correspond to physically manifested
spatial structures in a spatially distributed flow. The procedure can be characterized
as statistical and empirical in nature, in that the basis is derived via spectral
decomposition of the covariance associated with an ensemble of empirically generated
signals. Its effectiveness relies on the ability of the ensemble to capture
the essential system behavior. The POD coordinate transformation is not derived
directly from the model or its intrinsic properties. Furthermore, it completely
ignores the state-to-output relationship.

The empirical eigenvectors are computed easily using SVD. The corresponding
eigenvalues provide a meaningful ranking of state components in terms of signal
energy as captured by the ensemble. However, there exist no explicit error
bounds in terms of the eigenvalues or otherwise. Coordinate transformation and
component truncation occur simultaneously via linear projection onto the
low-dimensional subspace. Versions of the method have been applied within overall
ad-hoc procedures for particular situations.

The balancing approach produces a linear change of coordinates for an LTI
system. The balanced realization is derived directly from the model parameters
$(A, B, C)$ through decompositions of the Gramians $W_c$ and $W_o$. The procedure can
be characterized as control-theoretic in nature, in that it derives from controllability
and observability properties of the system, although the signal injection view
of Moore reveals connections to the POD.

The balancing transformation is easily computable using efficient matrix algebra
algorithms. The Hankel singular values provide a well-defined and meaningful
ranking of state components in terms of contribution to the input-to-output
behavior. Furthermore, the norm of the error between the full and reduced models is
explicitly bounded in terms of the discarded Hankel singular values. Component
truncation is performed via partition and subsystem elimination. The method has
been applied to linear models for various physical systems.
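
As an illustration of how little machinery the LTI case needs, the following sketch balances a small system with the square-root algorithm (Cholesky factors of the Gramians followed by one SVD). This is a standard technique and not necessarily the exact procedure of Section 3.3; the function name balance and the three-state example are hypothetical. The quantity error_bound is the well-known bound of twice the sum of the discarded Hankel singular values.

```python
import numpy as np
from scipy.linalg import cholesky, solve_continuous_lyapunov, svd


def balance(A, B, C):
    """Return T, T^{-1}, Hankel singular values, and Gramians for x = T z."""
    Wc = solve_continuous_lyapunov(A, -B @ B.T)      # A Wc + Wc A^T + B B^T = 0
    Wo = solve_continuous_lyapunov(A.T, -C.T @ C)    # A^T Wo + Wo A + C^T C = 0
    Lc = cholesky(Wc, lower=True)                    # Wc = Lc Lc^T
    Lo = cholesky(Wo, lower=True)                    # Wo = Lo Lo^T
    U, hsv, Vt = svd(Lo.T @ Lc)                      # singular values = Hankel s.v.
    T = (Lc @ Vt.T) / np.sqrt(hsv)                   # T = Lc V Sigma^{-1/2}
    Tinv = (U / np.sqrt(hsv)).T @ Lo.T               # T^{-1} = Sigma^{-1/2} U^T Lo^T
    return T, Tinv, hsv, Wc, Wo


# Hypothetical stable, controllable, and observable example.
A = np.array([[-1.0, 1.0, 0.0], [0.0, -2.0, 1.0], [0.0, 0.0, -3.0]])
B = np.array([[0.0], [0.0], [1.0]])
C = np.array([[1.0, 0.0, 0.0]])

T, Tinv, hsv, Wc, Wo = balance(A, B, C)

# Partition and truncate: keep the k most important balanced states.
k = 2
Ab, Bb, Cb = Tinv @ A @ T, Tinv @ B, C @ T
Ar, Br, Cr = Ab[:k, :k], Bb[:k, :], Cb[:, :k]
error_bound = 2.0 * hsv[k:].sum()
```

In the balanced coordinates both Gramians equal diag(hsv), which is what makes the ranking of state components unambiguous.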

The theory and procedure of Scherpen generalizes the established linear method
to the nonlinear setting. The balancing transformation is nonlinear and local to a
neighborhood of the origin. It retains some of the appealing features of the linear
method, e.g., the balancing transformation is derived directly from the model
parameters $f$, $g$, and $h$, and emphasizes state components that are both strongly
controllable and strongly observable, so that state components which are least
likely to influence the measurements are truncated. However, the procedure is not
easily computable and its performance has not yet been observed in applications.
We address these issues in Chapter 4.

Chapter 4


Computing  Balanced  Realizations 
for  Nonlinear  Systems 

4.1  Introduction 

This chapter addresses the problem of computability pertaining to the Scherpen
theory and procedure for balancing of nonlinear systems. We offer methods and
algorithms toward computing balanced realizations for stable affine nonlinear control
systems, i.e., state-space models of the form

$$\dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} g_i(x(t))\, u_i(t) \tag{4.1}$$

$$y(t) = h(x(t)) \tag{4.2}$$

where $u = (u_1, \ldots, u_m) \in U \subset \mathbb{R}^m$, $y = (y_1, \ldots, y_p) \in \mathbb{R}^p$, and $x = (x_1, \ldots, x_n)$
are local coordinates for a smooth state-space manifold M. The maps $f, g_1, \ldots, g_m$
are smooth and we assume that $f(0) = 0$ and $h(0) = 0$.

We say that $f$ is stable (asymptotically stable) if 0 is a stable (asymptotically
stable) equilibrium for $\dot{x} = f(x)$, and normally assume asymptotic stability of $f$.
We refer to the triple $(f, g, h)$ as a realization of the nonlinear system.

In  Sections  4.2  and  4.3  we  consider  the  problem  of  computing  the  controllability 
energy  function  without  solving  the  family  of  optimal  control  problems  implied  in 
its  definition.  Stochastically  excited  systems  (see  Section  2.6)  play  a  major  role 
in  our  methodology.  We  present  a  stochastic  method  for  computing  an  estimate 
of  the  controllability  function,  and  show  that  in  certain  situations  the  method 
provides an exact solution. The procedure is tested on applications via Monte
Carlo experiments.

The crucial step in the balancing procedure is finding a local coordinate
transformation under which the controllability function is quadratic in a neighborhood
of 0. The Morse-Palais lemma ensures the existence of a quadratic canonical form
for a function with a non-degenerate critical point at 0, such as the controllability
and observability energy functions. However, it provides no constructive procedure
for obtaining the desired local change of coordinates. In Section 4.4 we develop an
algorithm for computing the desired nonlinear transformation.

In Section 4.5 we present the overall procedure for computing the balancing
transformation and algorithms for performing the required computations. We apply
the methods developed in this chapter to two example problems. The results
are presented in Section 4.6. We compute a balanced realization for a forced
damped pendulum system, and make progress toward balancing a forced damped
double pendulum system. Additional remarks and a summary are presented in
Section 4.7.

4.2 Energy Functions


As  stated  earlier,  a  balanced  realization  is  one  that  is  equally  controllable  and 
observable.  In  order  to  make  such  a  statement  meaningful,  we  must  provide  a 
measure  of  the  degree  to  which  a  system  realization,  and  its  state  components,  are, 
respectively,  controllable  and  observable.  These  properties  can  be  quantified  in  a 
precise  way  via,  respectively,  the  controllability  and  observability  energy  functions 
of  the  system. 


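Definition 4.2.1 (Controllability Function) The controllability function,
$L_c : \mathbb{R}^n \to \mathbb{R}$, for system (4.1)-(4.2) is defined by

$$L_c(x_0) = \min_{\substack{u \in \mathcal{L}_2(-\infty, 0) \\ x(-\infty) = 0,\ x(0) = x_0}} \frac{1}{2} \int_{-\infty}^{0} \| u(t) \|^2 \, dt \tag{4.3}$$

□

Definition 4.2.2 (Observability Function) The observability function,
$L_o : \mathbb{R}^n \to \mathbb{R}$, for system (4.1)-(4.2) is defined by

$$L_o(x_0) = \frac{1}{2} \int_{0}^{\infty} \| y(t) \|^2 \, dt, \qquad x(0) = x_0, \quad u(t) = 0, \ t \geq 0 \tag{4.4}$$

□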

Remark 4.2.3 The value of $L_c$ at state $x_0$ is the minimum amount of control
energy required to reach the state $x_0$ from 0. The value of $L_o$ at state $x_0$ is the
amount of output energy generated by the system's natural response to initial state
$x_0$. □

Remark 4.2.4 There are other definitions of $L_o$ for which the input u plays a
direct role. These can be considered closed loop generalizations of (4.4), which
then corresponds to an open loop observability function. For details and results
see [58, 142]. In this thesis we use only the $L_o$ given by (4.4). □

Fact 4.2.5 In the case of an LTI system (3.76) the controllability and observability
functions, respectively, specialize to the quadratic functions

$$L_c(x_0) = \frac{1}{2}\, x_0^T W_c^{-1} x_0 \tag{4.5}$$

$$L_o(x_0) = \frac{1}{2}\, x_0^T W_o\, x_0 \tag{4.6}$$

where the symmetric positive-definite matrices $W_c$ and $W_o$ are, respectively, the
familiar controllability and observability Gramian matrices, given, respectively,
by (3.78) and (3.79) (see [140, 141]). □
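
The observability half of this fact follows in one line from Definition 4.2.2. The sketch below assumes that (3.79), which lies outside this section, is the standard integral form of the observability Gramian, $W_o = \int_0^\infty e^{A^T t} C^T C e^{A t}\, dt$. With $u = 0$ the output of the LTI system is $y(t) = C e^{At} x_0$, so

```latex
L_o(x_0)
  = \frac{1}{2} \int_0^\infty \left\| C e^{At} x_0 \right\|^2 dt
  = \frac{1}{2}\, x_0^T \left( \int_0^\infty e^{A^T t} C^T C\, e^{At}\, dt \right) x_0
  = \frac{1}{2}\, x_0^T W_o\, x_0 .
```

The controllability half requires solving a minimum-energy control problem and is not reproduced here.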

4.2.1  Properties 

There are several properties of the controllability and observability energy
functions that we will use in our computational effort. It is clear that both of these
functions are non-negative and vanish at 0. However, they are not necessarily
finite everywhere in a neighborhood of the origin, nor is the minimum at 0
necessarily non-degenerate, i.e., isolated. We need both $L_c$ and $L_o$ to be finite (i.e., to
exist) and non-degenerate in a neighborhood of 0 in order to perform the balancing
computations. We now discuss conditions under which the energy functions enjoy
those properties.

Theorem 4.2.6 (Scherpen and Gray [142]) Suppose that $f$ is asymptotically
stable on a neighborhood W of 0. Then $L_c(x)$ is smooth, finite, and satisfies
$L_c(x) > 0$ for $x \in W$, $x \neq 0$ if and only if the system (4.1)-(4.2) is
asymptotically reachable from 0 on W. □

Remark 4.2.7 It makes intuitive sense that a reachability property determines
whether $L_c$ is finite, since if we have a state $x_0$ that is not reachable, the minimum
in (4.3) will fail to exist at $x_0$. In that case, by convention, we take $L_c(x_0) = \infty$.
□

Remark 4.2.8 Non-degeneracy of $L_c$ is guaranteed by asymptotic stability of $f$.
Intuitively, some positive control energy must be required to steer away from the
asymptotically stable origin to states in some open neighborhood. □

Interestingly, but not surprisingly, we have a dual situation regarding $L_o$. In a
reversal of the situation for $L_c$, non-degeneracy (rather than finiteness) of $L_o$ is
determined by an observability property, and finiteness (rather than non-degeneracy)
of $L_o$ is determined by stability properties. We first state the condition for
non-degeneracy of $L_o$.

Theorem 4.2.9 (Scherpen [140]) Suppose that $f$ is asymptotically stable on a
neighborhood W of 0. If the system (4.1)-(4.2) is zero-state observable on W, then
$L_o(x) > 0$ for $x \in W$, $x \neq 0$. □

To ensure that $L_o$ is finite, we use stability conditions on $f$ together with
additional conditions on $h$. It turns out that the typical situation in which the origin
is exponentially stable and $h$ is smooth with $h(0) = 0$ is sufficient to guarantee
that $L_o$ is finite on a neighborhood W of 0. In fact, we derive somewhat weaker
sufficient conditions, presented in the following.

Proposition 4.2.10 Suppose that $f$ is asymptotically stable with region of
attraction W. Let $x_0 \in W$. Let $x(t)$ be the unique solution to the initial value problem

$$\dot{x} = f(x), \qquad x(0) = x_0 \tag{4.7}$$

Suppose that there exists a time $t_1 \geq 0$ and positive constants $\alpha$ and $\beta$ such that
for all $t \geq t_1$

$$\| x(t) \| \leq \alpha \exp\left( -\beta (t - t_1) \right) \tag{4.8}$$

Suppose that $h$ is locally Lipschitz on W with $h(0) = 0$. Define $y(t) = h(x(t))$ for
$t \geq 0$. Then $y \in \mathcal{L}_2(0, \infty)$.


Proof Since $h$ is locally Lipschitz on W, there exists a constant $L > 0$ such that

$$\| h(x) - h(0) \| \leq L\, \| x - 0 \| \tag{4.9}$$

in some compact neighborhood $V \subset W$ of 0. By $h(0) = 0$ we have $\| h(x) \| \leq L \| x \|$
for all $x \in V$. Let $U = V \cap B(0, \alpha)$. Let $t_2 = \min \{ t \geq 0 : x(t) \in U \}$. Then for
$t \geq t_2$

$$\| y(t) \| = \| h(x(t)) \| \leq L \alpha \exp\left( -\beta (t - t_1) \right) \tag{4.10}$$

and

$$\int_{t_2}^{\infty} \| y(t) \|^2 \, dt \leq L^2 \alpha^2 \exp(2 \beta t_1) \int_{t_2}^{\infty} \exp(-2 \beta t)\, dt = \frac{L^2 \alpha^2}{2 \beta} \exp\left( 2 \beta (t_1 - t_2) \right) \tag{4.11}$$

Now, by asymptotic stability, $x(t)$ is finite for all $t \geq 0$ for arbitrary $x_0 \in W$.
Furthermore, $h$ is continuous on V (by Lipschitz property and compactness of V).
Therefore $\| y(t) \| = \| h(x(t)) \| < \infty$ for $0 \leq t \leq t_2$. This implies that for some $\gamma > 0$

$$\int_{0}^{t_2} \| y(t) \|^2 \, dt \leq \gamma < \infty \tag{4.12}$$

Thus,

$$\int_{0}^{\infty} \| y(t) \|^2 \, dt \leq \gamma + \frac{L^2 \alpha^2}{2 \beta} \exp\left( 2 \beta (t_1 - t_2) \right), \tag{4.13}$$

i.e., $y \in \mathcal{L}_2[0, \infty)$. ■

Remark 4.2.11 The proposition asserts that $L_o$ is finite on W whenever all of
the following hold:

(i) The equilibrium 0 is asymptotically stable with region of attraction containing
W.

(ii) There exists a neighborhood of 0, possibly smaller, in which 0 is exponentially
stable.

(iii) The function $h$ is at least locally Lipschitz on W.

□


Remark 4.2.12 Scherpen [141] considers only the more restrictive case where $h$
is smooth and states the more conservative condition that the linearization
$A = \frac{\partial f}{\partial x}(0)$ be asymptotically stable, i.e., that 0 be exponentially stable.

□


Remark 4.2.13 Given a Lipschitz $h$, mere asymptotic stability of 0 without a
neighborhood of exponential stability is insufficient to guarantee a finite $L_o$.
Systems whose linearizations yield eigenvalues on the imaginary axis are candidates
for belonging to such a class of systems. For example, consider the system $\dot{x} = -x^3$
with $h(x) = x$ (smooth) and $x(0) = x_0 \neq 0$. The output trajectory is
$y(t) = x_0 (1 + 2 x_0^2 t)^{-1/2}$, whose magnitude behaves like $(1/\sqrt{2})\, t^{-1/2}$ for large t,
so that $\| y(t) \|^2 \to 0$ as $t^{-1}$, but $y$ does not belong to $\mathcal{L}_2(0, \infty)$. If $L_o$ fails to exist
at $x_0$ then by convention we take $L_o(x_0) = \infty$. □
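
The divergence in this example can be verified by direct integration. The short derivation below is a worked check for an arbitrary initial state $x_0 \neq 0$ (the symbol $T$ for the upper limit of integration is introduced here for illustration only).

```latex
\dot{x} = -x^3,\ x(0) = x_0 \neq 0
\;\Longrightarrow\;
\frac{d}{dt}\, x^{-2} = 2
\;\Longrightarrow\;
x(t) = \frac{x_0}{\sqrt{1 + 2 x_0^2 t}} ,
\qquad
\int_0^T y(t)^2 \, dt
  = \int_0^T \frac{x_0^2}{1 + 2 x_0^2 t}\, dt
  = \frac{1}{2} \ln\!\left( 1 + 2 x_0^2 T \right)
  \;\longrightarrow\; \infty
  \quad \text{as } T \to \infty .
```

The output energy grows only logarithmically, which is why the failure is easy to miss in a short numerical simulation.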


Remark 4.2.14 The systems we work with generally have smooth $h$ with $h(0) = 0$.
Moreover, many physical systems, including mechanical systems with damping,
enjoy exponential stability. Throughout this thesis, we usually can assume that the
functions $L_c$ and $L_o$ are finite, smooth, and non-degenerate everywhere in their
domain, without further explanation. □

We now present an important theoretical result that we use later to analyze
the results of our computational methods. The functions $L_c$ and $L_o$ each satisfy a
familiar nonlinear PDE, respectively, a Hamilton-Jacobi-Bellman (HJB) equation
associated with an optimal control problem, and a nonlinear type of Lyapunov
equation.

Theorem 4.2.15 (Scherpen [140, 141])

(i) Suppose that 0 is an asymptotically stable equilibrium of
$-\left( f + g\, g^T \frac{\partial^T L_c}{\partial x} \right)$
on a neighborhood V of 0. Then, for all $x \in V$, $L_c$ is the unique smooth
solution of

$$0 = \frac{\partial L_c}{\partial x}(x)\, f(x) + \frac{1}{2}\, \frac{\partial L_c}{\partial x}(x)\, g(x)\, g^T(x)\, \frac{\partial^T L_c}{\partial x}(x), \qquad L_c(0) = 0 \tag{4.14}$$

under the assumption that (4.14) has a smooth solution on V.

(ii) Suppose that 0 is an asymptotically stable equilibrium of $f(x)$ on a
neighborhood W of 0. Then, for all $x \in W$, $L_o(x)$ is the unique smooth solution
of

$$0 = \frac{\partial L_o}{\partial x}(x)\, f(x) + \frac{1}{2}\, h^T(x)\, h(x), \qquad L_o(0) = 0 \tag{4.15}$$

under the assumption that (4.15) has a smooth solution on W.

Proof See Remark 4.2.16 and Appendix F. ■

Remark 4.2.16 Scherpen [140, 141] proves Theorem 4.2.15 via a completing the
square argument and straightforward manipulation of Equations (4.14) and (4.15)
and definitions (4.3) and (4.4). We offer a different proof in Appendix F that appeals
to the connections between Equations (4.14) and (4.15) and optimal control theory.
□

Remark 4.2.17 Smooth solutions of (4.14) and (4.15) exist locally about 0 if the
matrix $A = \frac{\partial f}{\partial x}(0)$ is Hurwitz, so we will assume that the theorem is generally
applicable. □
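
For a scalar system ($n = m = 1$) with $g(x) \neq 0$, the HJB equation for $L_c$ can be solved by quadrature, which gives a quick sanity check on part (i) of the theorem. This is an illustrative special case worked out here, not a result quoted from [140, 141]; the example $f(x) = -x - x^3$, $g = 1$ is hypothetical. Writing $L_c' = dL_c/dx$:

```latex
0 = L_c'(x) \left( f(x) + \tfrac{1}{2}\, g(x)^2 L_c'(x) \right)
\;\Longrightarrow\;
L_c'(x) = -\frac{2 f(x)}{g(x)^2}
\;\Longrightarrow\;
L_c(x) = -2 \int_0^x \frac{f(s)}{g(s)^2}\, ds ,
\qquad
f(x) = -x - x^3,\ g = 1
\;\Longrightarrow\;
L_c(x) = x^2 + \tfrac{1}{2}\, x^4 .
```

The root $L_c' \equiv 0$ is discarded because $L_c$ must be positive away from 0. With the remaining root, $-(f + g^2 L_c') = f$, which is asymptotically stable at 0 as the hypothesis of part (i) requires.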


Remark 4.2.18 For the case of a linear system, Equations (4.14) and (4.15)
specialize to the matrix Lyapunov equations (3.83) and (3.84) (repeated below)

$$A W_c + W_c A^T + B B^T = 0$$

$$A^T W_o + W_o A + C^T C = 0$$

□
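
The specialization can be checked by substituting the quadratic forms of Fact 4.2.5 into the two PDEs. The steps below are a sketch of that substitution for $f(x) = Ax$, $g(x) = B$, $h(x) = Cx$, using the row-vector convention $\partial L_c / \partial x = x^T W_c^{-1}$ and $\partial L_o / \partial x = x^T W_o$.

```latex
0 = x^T W_c^{-1} A x + \tfrac{1}{2}\, x^T W_c^{-1} B B^T W_c^{-1} x
  = \tfrac{1}{2}\, x^T \left( W_c^{-1} A + A^T W_c^{-1} + W_c^{-1} B B^T W_c^{-1} \right) x
  \quad \forall x
\;\Longrightarrow\;
A W_c + W_c A^T + B B^T = 0 ,
\\[4pt]
0 = x^T W_o A x + \tfrac{1}{2}\, x^T C^T C x
  = \tfrac{1}{2}\, x^T \left( W_o A + A^T W_o + C^T C \right) x
  \quad \forall x
\;\Longrightarrow\;
A^T W_o + W_o A + C^T C = 0 .
```

The first implication uses the symmetry of the bracketed matrix and then multiplies on both sides by $W_c$.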


4.2.2  Remarks  on  Computation  and  Applications 

Here  we  briefly  point  out  some  of  the  difficulties  involved  with  computing  the 
controllability  and  observability  energy  functions,  and  discuss  some  straightforward 
but  likely  impractical  approaches. 

Suppose that we have suitably discretized the state-space in such a way that
there are p evenly spaced grid points along each of the n dimensions. This means
that there are $p^n$ total grid points at which the energy functions are to be evaluated.
We denote the set of discrete grid points by $X \subset \mathbb{R}^n$.

A direct computation of $L_c$ from definition (4.3) requires the numerical solution
of $p^n$ optimal control problems for the nonlinear system (4.1), one for each initial
state $x_0 \in X$. In particular, to use a standard solution approach, the minimization
problem in (4.3) should be restated so that the optimal control problem corresponds
to signals in positive time, i.e., controls in $\mathcal{L}_2(0, \infty)$. We make the changes of
variables

$$\tau = -t, \qquad z(\tau) = x(-\tau), \qquad v(\tau) = u(-\tau) \tag{4.16}$$

for $t \leq 0$ and $\tau \geq 0$ so that (4.1) and (4.3), respectively, transform to

$$\dot{z}(\tau) = -f(z(\tau)) - g(z(\tau))\, v(\tau) \tag{4.17}$$

and

$$L_c(x_0) = \min_{\substack{v \in \mathcal{L}_2(0, \infty) \\ z(0) = x_0,\ z(\infty) = 0}} \frac{1}{2} \int_{0}^{\infty} \| v(\tau) \|^2 \, d\tau \tag{4.18}$$

Remark 4.2.19 For the system (4.17), 0 is an unstable equilibrium. Therefore,
the minimum energy control which takes the state from $x_0 \neq 0$ to 0 cannot be
$u = 0$, again demonstrating the non-degeneracy of $L_c$. □

There exist computational methods and at least one software toolbox
(RIOTS [143]) for solving broad classes of optimal control problems such as (4.18).
However, regardless of the computational complexity of the solution algorithms,
the overall computational complexity is at least $O(p^n)$, the number of optimal
control problems we need to solve. Even for very low-order systems, the computational
expense is prohibitive.

Remark 4.2.20 Similarly, the overall computational complexity for a direct
computation of $L_o$ from definition (4.4) is at least $O(p^n)$. However, for each $x \in X$,
rather than a numerical solution of an optimal control problem, we merely need a
numerical integration of the system equations. Although still impractical in general,
it is feasible for low-order systems. □
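
For a low-order system the direct computation of the observability function looks as follows. This is a minimal sketch under stated assumptions: the example system (an unforced damped pendulum with angle measurement), the grid, the finite horizon t_final, and the helper name observability_energy are all hypothetical choices made for illustration. The output energy is accumulated as an extra state so that one ODE solve per grid point suffices.

```python
import itertools

import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_lyapunov


def observability_energy(f, h, x0, t_final=40.0):
    """Approximate 1/2 * integral of ||h(x(t))||^2 over [0, t_final] with u = 0."""
    def augmented(_, w):
        x = w[:-1]
        y = np.atleast_1d(h(x))
        return np.append(f(x), 0.5 * (y @ y))    # last state accumulates the energy
    w0 = np.append(np.asarray(x0, dtype=float), 0.0)
    sol = solve_ivp(augmented, (0.0, t_final), w0, rtol=1e-9, atol=1e-12)
    return sol.y[-1, -1]


# Hypothetical example: damped pendulum, state (angle, angular velocity).
f = lambda x: np.array([x[1], -np.sin(x[0]) - x[1]])
h = lambda x: x[0]

p, n = 5, 2
axis = np.linspace(-1.0, 1.0, p)
grid = list(itertools.product(axis, repeat=n))       # p**n grid points
Lo_grid = np.array([observability_energy(f, h, x0) for x0 in grid]).reshape(p, p)

# Near the origin the result should match the quadratic form of the linearization.
A = np.array([[0.0, 1.0], [-1.0, -1.0]])
C = np.array([[1.0, 0.0]])
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)        # A^T Wo + Wo A + C^T C = 0
x_small = np.array([0.01, 0.0])
Lo_small = observability_energy(f, h, x_small)
Lo_linear = 0.5 * x_small @ Wo @ x_small
```

The loop over grid makes the $p^n$ growth explicit: doubling the state dimension at the same resolution squares the number of ODE solves.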

Another computational approach that immediately comes to mind is numerical
solution of the nonlinear PDE (4.14) for the value function $L_c$. Equation (4.14)
is of Hamilton-Jacobi type (see [47, 93] for an extensive analysis of this type of
equation, including existence and uniqueness results, and properties of solutions)
and of the general form

$$H(x, u, Du) = 0, \quad x \in \Omega; \qquad u = \phi, \quad x \in \partial\Omega \tag{4.19}$$

where the function H is called the Hamiltonian, u is the unknown function, Du
denotes the gradient of u, $\Omega$ is the domain of definition for u, and $\phi$ is a prescribed
boundary condition. In the case of (4.14) we have $u = L_c$ and $\Omega = \mathbb{R}^n$. Rather
than a boundary condition we have the supplemental condition $L_c(0) = 0$.

These types of equations are in general nonlinear first-order problems for which
one cannot expect to find classical solutions (i.e., a solution of class $C^1$ at least).
Instead, one must deal with suitable generalized solutions (i.e., locally Lipschitz
on $\Omega$, continuous on $\bar{\Omega}$, and almost everywhere differentiable). The correct class
of generalized solutions was established by Crandall and Lions in [33, 93]. There
they introduced the notion of viscosity solutions of nonlinear first-order PDEs,
which are the generalized solutions of primary interest in many areas of application
including this one. Briefly, under certain hypotheses, for $\epsilon > 0$, the solution $u_\epsilon$ of

$$H(x, u_\epsilon, Du_\epsilon) - \epsilon\, \Delta u_\epsilon = 0, \quad x \in \Omega; \qquad u_\epsilon = \phi, \quad x \in \partial\Omega \tag{4.20}$$

approximates the viscosity solution of (4.19) with error estimate

$$| u_\epsilon(x) - u(x) | \leq c \sqrt{\epsilon} \tag{4.21}$$

for some constant c.

Crandall and Lions [32] give finite-difference schemes for approximating these
viscosity solutions along with error estimates. Souganidis [153] establishes results
concerning the convergence of explicit and implicit finite-difference schemes to
viscosity solutions. However, except when dealing with low-dimensional state
spaces, the finite-difference schemes become impractical as the number of grid
points becomes prohibitively large.

Remark 4.2.21 Similarly, finite-difference schemes for solution of the nonlinear
PDE (4.15) become impractical in higher dimensions. □

4.3  Stochastic  Methods  for  Computation 

We  seek  a  method  for  computing  the  controllability  energy  function  without  solving 
the  family  of  optimal  control  problems  implied  in  its  definition,  or  solving  the 
associated  HJB  equation.  In  this  section  we  offer  an  approach,  based  primarily  on 
the  theory  of  stochastically  excited  dynamical  systems,  for  computing  an  estimate 
of  the  controllability  function.  We  show  that  in  certain  situations  the  method 
provides  an  exact  solution. 

4.3.1 Stationary Densities and the Controllability Function

In Section 2.6 we introduced the notion of a stochastically excited dynamical
system, i.e., a control system for which the m components of the input, $u_i$,
$i = 1, \ldots, m$, have been replaced by the sample paths of m Gaussian white noises,
$\{ (\zeta_t)_i,\ t \in \mathbb{R}_+ \}$, $i = 1, \ldots, m$. The state equation is given by

$$\frac{d}{dt} X_t = f(X_t) + \sum_{i=1}^{m} g_i(X_t)\, (\zeta_t)_i \tag{4.22}$$

Recall that the white noise driven system (4.22) is interpreted correctly via the
SDE (given elementwise)

$$(dX_t)_i = \bar{f}_i(X_t)\, dt + \sum_{j=1}^{m} g_{ij}(X_t)\, (dW_t)_j, \qquad i = 1, \ldots, n \tag{4.23}$$

where

$$\bar{f}_i(x) = f_i(x) + \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{m} \frac{\partial g_{ik}}{\partial x_j}(x)\, g_{jk}(x) \tag{4.24}$$

and $g_{ij} = (g_j)_i$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, and where (4.23) is defined in terms of a
corresponding stochastic integral.

Recall also that the state $X_t$ is a Markov process with transition probability
density $p(x, t; y, s)$. Time evolution of $p(x, t; y, s)$ is governed by the Fokker-Planck
equation, given by

$$\frac{\partial p}{\partial t}(x, t; y, s) = -\sum_{i=1}^{n} \frac{\partial}{\partial x_i} \left( \bar{f}_i(x)\, p(x, t; y, s) \right) + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2}{\partial x_i\, \partial x_j} \left( b_{ij}(x)\, p(x, t; y, s) \right) \tag{4.25}$$

with initial condition

$$p(x, s; y, s) = \delta(x - y)$$

and where

$$b_{ij}(x) = \sum_{k=1}^{m} g_{ik}(x)\, g_{jk}(x) = \left[ g(x)\, g(x)^T \right]_{ij}$$

Finally, recall that in the steady-state, the probability density is stationary,
and Equation (4.25) simplifies to the stationary Fokker-Planck equation

$$0 = -\sum_{i=1}^{n} \frac{\partial}{\partial x_i} \left( \bar{f}_i(x)\, p_\infty(x) \right) + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2}{\partial x_i\, \partial x_j} \left( b_{ij}(x)\, p_\infty(x) \right) \tag{4.26}$$

where $p_\infty(x)$ denotes the stationary probability density (if it exists).

In this section, we propose a method for computing the controllability function
that relies heavily on the above framework. We are motivated initially by
observations concerning the relationship between the controllability function of a
linear system and the stationary density of the corresponding linear stochastically
excited system.

Linear Case


The solution of a linear stochastic initial value problem is given by a variation of
constants formula for the state process. Moreover, the time evolutions of its mean
and its covariance are governed by a pair of ODEs. These results are summarized
in the following theorem (see, e.g., [36]).

Theorem 4.3.1 Let $\{W_t\}$ be an m-vector process with stationary orthogonal
increments, $x_0$ an n-vector random variable orthogonal to $\mathcal{H}^W$ with

$$\mu_0 = E[x_0] \tag{4.27}$$

$$R_0 = E\left[ (x_0 - \mu_0)(x_0 - \mu_0)^T \right] \tag{4.28}$$

and $A(\cdot)$, $B(\cdot)$ matrices of dimension $n \times n$ and $n \times m$, respectively, whose elements
are piecewise continuous real-valued functions. Then the stochastic initial value
problem

$$dX_t = A(t)\, X_t\, dt + B(t)\, dW_t, \qquad X_0 = x_0 \tag{4.29}$$

has the unique solution

$$X_t = \Phi(t, 0)\, x_0 + \int_0^t \Phi(t, s)\, B(s)\, dW_s \tag{4.30}$$

where $\Phi(\cdot, \cdot)$ is the transition matrix corresponding to $A(\cdot)$. The moments $\mu(t) = E[X_t]$
and $R(t) = E\left[ (X_t - \mu(t))(X_t - \mu(t))^T \right]$ are the unique solutions of the
initial value problems

$$\dot{\mu}(t) = A(t)\, \mu(t), \qquad \mu(0) = \mu_0 \tag{4.31}$$

$$\dot{R}(t) = A(t)\, R(t) + R(t)\, A^T(t) + B(t)\, B^T(t), \qquad R(0) = R_0 \tag{4.32}$$

□

Remark 4.3.2 It is well known that the response of a linear system to a Gaussian
stochastic input such as the white noise process is also a Gaussian process
(see, e.g., [126]). Thus, Theorem 4.3.1 implies that the state process $X_t$, given
by (4.30), is Gaussian, with mean $\mu(t)$ and covariance $R(t)$ evolving according
to (4.31) and (4.32), respectively. Additional moments are not required to
characterize the density. □

We  can  immediately  apply  this  result  to  the  case  of  an  asymptotically  stable 
LTI  system. 

Corollary  4.3.3  Consider  the  LTI  system  realization  ( A ,  B,  C )  with  A  stable  and 
controllability  Gramian  matrix  Wc.  Suppose  that  we  replace  the  input  signal  u(t ) 
with  the  sample  function  of  a  Gaussian  white  noise  process  {Ct}-  Suppose  that  the 
evolution  of  the  white  noise  driven  system  generates  the  state  process  Xt  with  mean 
$\mu(t)$ and covariance $R(t)$. Then the mean and covariance satisfy

$$\lim_{t \to \infty} \mu(t) = 0 \tag{4.33}$$

$$\lim_{t \to \infty} R(t) = W_c \tag{4.34}$$

Proof  The  key  observation  is  that  (4.32)  is  also  satisfied  if  we  replace  R(t)  with 
the  finite-time-horizon  Gramian  matrix 

$$W(t) = \int_0^t \exp(As)\, B\, B^T \exp(A^T s)\, ds \tag{4.35}$$

and let $W(0) = 0$ (see, e.g., [72]). By asymptotic stability, $R(t) = W(t) \to W_c$ as $t \to \infty$. In addition, asymptotic stability together with Equation (4.31) implies that $\mu(t) \to 0$ as $t \to \infty$. ■

Thus, for a stochastically excited LTI system, the transition density function $p(x, t; y, s)$ describing the random properties of the state process $X_t$ is Gaussian. The stationary density $p_\infty(x)$ has zero mean and covariance equal to the controllability Gramian matrix $W_c$, i.e.,

$$p_\infty(x) = \bigl[(2\pi)^n \det(W_c)\bigr]^{-1/2} \exp\left(-\tfrac{1}{2}\, x^T W_c^{-1} x\right) \tag{4.36}$$

Recall that in the LTI case, the controllability function $L_c$ is given by the quadratic form in Equation (4.5). Thus, in the LTI case, the controllability function $L_c$ and the stationary density are related exactly by

$$p_\infty(x) = \bigl[(2\pi)^n \det(W_c)\bigr]^{-1/2} \exp(-L_c(x)) \tag{4.37}$$

and

$$L_c(x) = -\log(p_\infty(x)) + \log\left(\bigl[(2\pi)^n \det(W_c)\bigr]^{-1/2}\right) \tag{4.38}$$
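As a quick numerical illustration of Corollary 4.3.3 and Equation (4.38), the following sketch treats a scalar LTI system $\dot{x} = -a x + b u$ with $a > 0$. The system, the parameter values, and the function names are assumptions chosen for illustration and are not taken from the text; the covariance equation (4.32) is integrated with a simple forward Euler scheme.

```python
import math


def covariance_at(t_end, a, b, r0=0.0, dt=1e-3):
    """Forward-Euler integration of the scalar form of (4.32): dR/dt = -2 a R + b^2."""
    r = r0
    for _ in range(int(round(t_end / dt))):
        r += dt * (-2.0 * a * r + b * b)
    return r


def gramian(a, b):
    """Controllability Gramian of dx/dt = -a x + b u (a > 0): integral of b^2 exp(-2 a s) over s >= 0."""
    return b * b / (2.0 * a)


def stationary_density(x, w_c):
    """Zero-mean Gaussian density with variance w_c, i.e. (4.36) for n = 1."""
    return math.exp(-0.5 * x * x / w_c) / math.sqrt(2.0 * math.pi * w_c)


def lc_from_density(x, w_c):
    """Right-hand side of (4.38) for n = 1."""
    return -math.log(stationary_density(x, w_c)) + math.log((2.0 * math.pi * w_c) ** -0.5)


a, b = 2.0, 3.0                       # hypothetical parameters
w_c = gramian(a, b)
r_late = covariance_at(10.0, a, b)    # R(t) for t much larger than 1/a
lc_quadratic = 0.5 * 1.5 ** 2 / w_c   # (1/2) x^2 / W_c evaluated at x = 1.5
lc_density = lc_from_density(1.5, w_c)
```

In this scalar setting `r_late` should agree with `w_c`, and `lc_density` should agree with `lc_quadratic`, which is the quadratic form that (4.37) identifies with $L_c$.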

Nonlinear  Setting 

In the nonlinear setting, the density $p(x, t; y, s)$, and in particular the stationary density $p_\infty(x)$, are not, in general, Gaussian, nor determined completely by their mean and covariance, i.e., higher order moments are involved. However, because the balancing coordinate transformation is local to a neighborhood of the origin, we are mainly interested in capturing a local characterization of the controllability function. In light of this, Equation (4.38) suggests that a useful approximation of $L_c$ is defined by

$$L'_c(x) = -\log(p_\infty(x)) + C \tag{4.39}$$

where $C$ is a normalizing constant, dependent on the particular system, such that $L'_c(0) = 0$. By Equation (4.38), $L'_c$ specializes to the exact $L_c$ in the LTI case.

Remark 4.3.4 The approximation $L'_c$ captures the nonlinearity intrinsic to the realization $(f, g)$, which is manifested in the stationary density $p_\infty$ through the evolution of the nonlinear stochastically excited system. It provides a useful working approximation, i.e., a reasonably accurate measure of the degree to which state components are controllable in a neighborhood of the origin. Substitution of the approximation $L'_c$ for $L_c$ into the Scherpen balancing procedure results in a realization that is not balanced, but nearly balanced. The property of equal controllability and observability of state components is satisfied so closely that the attractive properties of such a realization in terms of model reduction are still enjoyed. □

There exist certain nonlinear systems for which Equation (4.39) provides an exact, rather than approximate, formula for the controllability function. As one simple example, consider the process $X_t$ governed by the first-order SDE

$$dX_t = -\nabla\phi(X_t)\, dt + dW_t \tag{4.40}$$

where $\phi : \mathbb{R}^n \to \mathbb{R}$ is a $C^1$ map such that $-\nabla\phi$ is asymptotically stable.

Remark  4.3.5  In  general,  a  process  modeled  by  a  first-order  SDE  is  referred  to  as 
a  Langevin  process  or  an  Ornstein-Uhlenbeck  process.  The  terminology  originates 
with  the  equation  of  Langevin  [89],  which  describes  the  motion  of  a  free  particle  in  a 
viscous  fluid,  where  the  random  noise  models  the  impulsive  forces  due  to  collisions 
between  the  fluid  molecules  and  the  free  particle.  □ 


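The stationary Fokker-Planck equation for the steady-state transition density $p_\infty$ of the Langevin process governed by (4.40) is given by

$$0 = \frac{1}{2}\sum_{i=1}^{n} \frac{\partial^2 p_\infty}{\partial x_i^2}(x) + \sum_{i=1}^{n} \frac{\partial^2 \phi}{\partial x_i^2}(x)\, p_\infty(x) + \sum_{i=1}^{n} \frac{\partial \phi}{\partial x_i}(x)\, \frac{\partial p_\infty}{\partial x_i}(x) \tag{4.41}$$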


i.e.,

$$0 = \frac{1}{2}\, \Delta p_\infty(x) + p_\infty(x)\, \Delta\phi(x) + (\nabla p_\infty(x))^T\, \nabla\phi(x) \tag{4.42}$$

Proposition 4.3.6 The density function

$$p_\infty^{MB}(x) = C \exp(-2\, \phi(x)) \tag{4.43}$$

satisfies the stationary Fokker-Planck equation (4.42) where $C$ is a constant such that $\int p_\infty^{MB} = 1$.

Proof Equation (4.42) follows directly from

$$\nabla p_\infty^{MB}(x) = -2\, C \exp(-2\, \phi(x))\, \nabla\phi(x) = -2\, p_\infty^{MB}(x)\, \nabla\phi(x) \tag{4.44}$$

and

$$\Delta p_\infty^{MB}(x) = -2\, \nabla \cdot \bigl(p_\infty^{MB}(x)\, \nabla\phi(x)\bigr) = -2 \left[ p_\infty^{MB}(x)\, \Delta\phi(x) + (\nabla p_\infty^{MB}(x))^T\, \nabla\phi(x) \right] \tag{4.45}$$

Remark 4.3.7 A density of the form (4.43) is referred to as a Maxwell-Boltzmann density. It originally appeared in the work of Maxwell and Boltzmann on modeling heat in a medium as the random motion of the constituent molecules, where $\phi$ represents the total energy in the system. The steady-state density describing the random properties of molecule positions and velocities is of the form (4.43). □

Now, using (4.39), define

$$L_c^{MB}(x) = -\log\bigl(p_\infty^{MB}(x)\bigr) + \log(C) = 2\, \phi(x) \tag{4.46}$$

Proposition 4.3.8 The function $L_c^{MB}$ satisfies the HJB equation (4.14) and thus is the unique controllability energy function for the Langevin system, i.e., stable affine nonlinear system with $f(x) = -\nabla\phi(x)$ and $g(x) = I$.

Proof The statement follows from straightforward substitution. ■

Systems modeled by the first-order SDE (4.40) do not comprise a sufficiently general class to be useful in many situations of interest. In particular, if $g(x) \neq I$, the density $p_\infty^{MB}$ does not generally satisfy the stationary Fokker-Planck equation and the function $L_c^{MB}$ does not generally satisfy the HJB equation. In the next section, we seek conditions under which a broader class of systems admits an exact relationship between the stationary density and the controllability function.
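For the Langevin system (4.40), Propositions 4.3.6 and 4.3.8 can be checked numerically in one dimension. The sketch below is an illustration under stated assumptions: the potential $\phi(x) = x^4/4 + x^2/2$ is a hypothetical choice, the Fokker-Planck residual of (4.42) is approximated with central differences, and the HJB equation (4.14), which is stated earlier in the chapter, is assumed to have the standard Scherpen form $\frac{\partial L_c}{\partial x} f + \frac{1}{2} \frac{\partial L_c}{\partial x} g g^T \frac{\partial L_c}{\partial x}^T = 0$.

```python
import math


def phi(x):
    """Hypothetical potential used only for illustration."""
    return 0.25 * x ** 4 + 0.5 * x ** 2


def dphi(x):
    return x ** 3 + x


def d2phi(x):
    return 3.0 * x ** 2 + 1.0


def p_mb(x):
    """Unnormalized Maxwell-Boltzmann density (4.43); the constant C cancels in (4.42)."""
    return math.exp(-2.0 * phi(x))


def fp_residual(x, h=1e-3):
    """Central-difference residual of (4.42) in one dimension."""
    dp = (p_mb(x + h) - p_mb(x - h)) / (2.0 * h)
    d2p = (p_mb(x + h) - 2.0 * p_mb(x) + p_mb(x - h)) / h ** 2
    return 0.5 * d2p + p_mb(x) * d2phi(x) + dp * dphi(x)


def hjb_residual(x):
    """Residual of the assumed HJB form with f = -phi', g = 1 and L_c = 2 phi, cf. (4.46)."""
    dL = 2.0 * dphi(x)
    return dL * (-dphi(x)) + 0.5 * dL * dL


residuals = [(fp_residual(x), hjb_residual(x)) for x in (-1.2, 0.0, 0.7)]
```

Both residuals should vanish up to discretization error, which is consistent with $L_c^{MB} = 2\phi$ being exact for this class.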

4.3.2  Second-Order  Mechanical  Systems 

In  this  section  we  determine  conditions  under  which  the  controllability  function 
for  a  second-order  mechanical  system  can  be  expressed  exactly  in  terms  of  the 
stationary  density  for  the  corresponding  stochastically  excited  system.  We  adopt 
and  modify  somewhat  the  notation  and  framework  of  Fuller  [51]  and  Zhu  and 
Yang  [171].  These  authors  have  presented  conditions  under  which  exact  solutions 
of  the  stationary  Fokker-Planck  equation  can  be  derived.  We  show  that  in  certain 
cases,  the  same  conditions  are  sufficient  for  expressing  the  controllability  function 
in  terms  of  the  stationary  density,  while  in  other  cases,  additional  conditions  are 
required. 

Hamiltonian  System  Perturbed  by  Dissipation  and  Forcing 

We consider a forced, dissipatively perturbed, n-DOF Hamiltonian system as described in Section 2.7. Let $q = (q_1, \ldots, q_n) \in \mathbb{R}^n$ and $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$ denote, respectively, the generalized displacements and momenta. Let the Hamiltonian $H' = H'(q, p)$, i.e., the sum of the kinetic and potential energies of the system, be $C^2$. Let $c'_{ij} = c'_{ij}(q, p)$ for $i, j \in \underline{n}$ be $C^1$ functions representing generalized nonlinear dissipation coefficients. Let $d_{ik} = d_{ik}(q, p)$ for $i \in \underline{n}$, $k \in \underline{m}$ be $C^2$. The system that we consider is governed by the equations of motion, for $i \in \underline{n}$,

$$\dot{q}_i = \frac{\partial H'}{\partial p_i} \tag{4.47}$$

$$\dot{p}_i = -\frac{\partial H'}{\partial q_i} - \sum_{j=1}^{n} c'_{ij}\, \frac{\partial H'}{\partial p_j} + \sum_{k=1}^{m} d_{ik}\, u_k \tag{4.48}$$

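The system is realized in standard state-space form with coordinates $x = (q, p) \in \mathbb{R}^{2n}$ and

$$f_i = \frac{\partial H'}{\partial p_i}, \qquad i = 1, \ldots, n$$

$$f_i = -\frac{\partial H'}{\partial q_i} - \sum_{j=1}^{n} c'_{ij}\, \frac{\partial H'}{\partial p_j}, \qquad i = n+1, \ldots, 2n \tag{4.49}$$

$$(g_k)_i = 0, \qquad i = 1, \ldots, n; \quad k = 1, \ldots, m$$

$$(g_k)_i = d_{ik}, \qquad i = n+1, \ldots, 2n; \quad k = 1, \ldots, m \tag{4.50}$$

The output map $h$ is irrelevant for purposes of the discussion here.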


Stochastically  Excited  System 


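The corresponding stochastically excited system is governed by the SDEs, for $i \in \underline{n}$,

$$dQ_i = \frac{\partial H'}{\partial P_i}\, dt \tag{4.51}$$

$$dP_i = -\left[ \frac{\partial H'}{\partial Q_i} + \sum_{j=1}^{n} c'_{ij}\, \frac{\partial H'}{\partial P_j} - \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{m} \frac{\partial d_{ik}}{\partial P_j}\, d_{jk} \right] dt + \sum_{k=1}^{m} d_{ik}\, (dW_t)_k \tag{4.52}$$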


where we have adopted the usual notation by substituting $Q$ for $q$ and $P$ for $p$ when dealing with the corresponding random variables.

It is usually the case that the correction terms $\frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{m} \frac{\partial d_{ik}}{\partial P_j}\, d_{jk}$, $i \in \underline{n}$, can be split into two parts: one which modifies the conservative forces and the other that modifies the damping forces (see [171]). We assume that this can be accomplished, and

(i) combine the first part with $-\frac{\partial H'}{\partial Q_i}$ to form effective conservative force terms $-\frac{\partial H}{\partial Q_i}$ with a new Hamiltonian $H = H(Q, P)$ such that $\frac{\partial H}{\partial P_i} = \frac{\partial H'}{\partial P_i}$;

(ii) combine the second part with $-\sum_{j=1}^{n} c'_{ij}\, \frac{\partial H'}{\partial P_j}$ to form effective dissipative force terms $-\sum_{j=1}^{n} c_{ij}\, \frac{\partial H}{\partial P_j}$ with new damping coefficients $c_{ij} = c_{ij}(Q, P)$.

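Equations (4.51) and (4.52) can be rewritten, for $i \in \underline{n}$,

$$dQ_i = \frac{\partial H}{\partial P_i}\, dt \tag{4.53}$$

$$dP_i = -\left[ \frac{\partial H}{\partial Q_i} + \sum_{j=1}^{n} c_{ij}\, \frac{\partial H}{\partial P_j} \right] dt + \sum_{k=1}^{m} d_{ik}\, (dW_t)_k \tag{4.54}$$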


Stationary  Fokker-Planck  Equation 


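The stationary Fokker-Planck equation governing the stationary transition density $p_\infty = p_\infty(q, p)$ associated with the SDEs (4.53) and (4.54) is given by

$$0 = \sum_{i=1}^{n} \left[ -\frac{\partial}{\partial Q_i} \left( \frac{\partial H}{\partial P_i}\, p_\infty \right) + \frac{\partial}{\partial P_i} \left( \frac{\partial H}{\partial Q_i}\, p_\infty \right) \right] + \sum_{i=1}^{n} \frac{\partial}{\partial P_i} \left( \sum_{j=1}^{n} c_{ij}\, \frac{\partial H}{\partial P_j}\, p_\infty \right) + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2}{\partial P_i\, \partial P_j} \bigl( b_{ij}\, p_\infty \bigr) \tag{4.55}$$

where

$$b_{ij} = \sum_{k=1}^{m} d_{ik}\, d_{jk}$$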


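and subject to boundary conditions (vanishing probability flow)

$$\lim_{\|(q,p)\| \to \infty} \frac{\partial H}{\partial P_i}\, p_\infty = 0 \tag{4.56}$$

$$\lim_{\|(q,p)\| \to \infty} \left[ \left( \frac{\partial H}{\partial Q_i} + \sum_{j=1}^{n} c_{ij}\, \frac{\partial H}{\partial P_j} \right) p_\infty + \frac{1}{2} \sum_{j=1}^{n} \frac{\partial}{\partial P_j} \bigl( b_{ij}\, p_\infty \bigr) \right] = 0 \tag{4.57}$$

Observe that the first summation term on the right-hand side of (4.55) is equal to the Poisson bracket of $p_\infty$ and $H$, i.e.,

$$\{p_\infty, H\} = \sum_{i=1}^{n} \left[ -\frac{\partial H}{\partial P_i}\, \frac{\partial p_\infty}{\partial Q_i} + \frac{\partial H}{\partial Q_i}\, \frac{\partial p_\infty}{\partial P_i} \right] = \sum_{i=1}^{n} \left[ -\frac{\partial}{\partial Q_i} \left( \frac{\partial H}{\partial P_i}\, p_\infty \right) + \frac{\partial}{\partial P_i} \left( \frac{\partial H}{\partial Q_i}\, p_\infty \right) \right] \tag{4.58}$$

Thus, we can rewrite the stationary Fokker-Planck equation (4.55) as

$$0 = \{p_\infty, H\} + \sum_{i=1}^{n} \frac{\partial}{\partial P_i} \left( \sum_{j=1}^{n} c_{ij}\, \frac{\partial H}{\partial P_j}\, p_\infty \right) + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2}{\partial P_i\, \partial P_j} \bigl( b_{ij}\, p_\infty \bigr) \tag{4.59}$$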


Existence  of  Smooth  Stationary  Densities 


Before proceeding, we must establish the existence of a smooth stationary density in this special case of a stochastically excited and dissipatively perturbed Hamiltonian system. We appeal to Theorem 2.6.17. First we show that the corresponding deterministic system has the required property of local strong accessibility.

A system of the form (4.47)-(4.48) can be derived from the equations of motion (see Section 2.7)

$$M(q)\, \ddot{q} + C(q, \dot{q}) + N(q, \dot{q}) = F \tag{4.60}$$

Let $q_d$ and $\dot{q}_d$ represent desired position and velocity trajectories. Define $e = q - q_d$ as the error between the actual and desired trajectories. Consider the control law

$$F = M(q) \left( \ddot{q}_d - K_v\, \dot{e} - K_p\, e \right) + C(q, \dot{q}) + N(q, \dot{q}) + w \tag{4.61}$$

where $K_v$ and $K_p$ are constant gain matrices and $w$ is an exogenous input vector. The resulting error dynamics can be written as

$$M(q) \left( \ddot{e} + K_v\, \dot{e} + K_p\, e \right) = w \tag{4.62}$$

Since $M$ is positive definite for all $q$, we can write

$$\ddot{e} + K_v\, \dot{e} + K_p\, e = M^{-1}(q)\, w \tag{4.63}$$

We can choose $K_v$ and $K_p$ so that this linear ODE governing the error yields a stable, controllable LTI system. Moreover, for the LTI system, controllability is equivalent to local strong accessibility. Thus, using the above feedback transformation, we may conclude that the forced, dissipatively perturbed, Hamiltonian system with equations of motion (2.115) is locally strongly accessible.

Remark 4.3.9 The control law (4.61) is often referred to as the computed torque control law in the robotics literature (see, e.g., [112]). □
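The effect of the control law (4.61) can be seen in a small simulation. The sketch below is an illustration only: it assumes a hypothetical single pendulum with constant inertia, a gravity torque playing the role of $N$, no Coriolis term, the constant desired trajectory $q_d = 0$, and hypothetical scalar gains. With $w = 0$ the error should obey the linear dynamics (4.63) and decay to zero.

```python
import math

INERTIA = 1.0          # hypothetical M = m l^2
GRAVITY_TORQUE = 9.81  # hypothetical m g l, so that N(q) = m g l sin(q)
KP, KV = 4.0, 4.0      # hypothetical gains; error poles are both at -2


def simulate_error(e0, de0, w=0.0, t_end=10.0, dt=1e-3):
    """Forward-Euler simulation of M q'' + N(q) = F under the computed torque law (4.61)."""
    q, dq = e0, de0    # with q_d = 0 the error e equals q
    for _ in range(int(round(t_end / dt))):
        e, de = q, dq
        torque = INERTIA * (0.0 - KV * de - KP * e) + GRAVITY_TORQUE * math.sin(q) + w
        ddq = (torque - GRAVITY_TORQUE * math.sin(q)) / INERTIA
        q, dq = q + dt * dq, dq + dt * ddq
    return q, dq


final_error, final_error_rate = simulate_error(1.0, 0.0)
```

The nonlinear gravity term is cancelled exactly by the feedback, so the closed loop behaves like the LTI system $\ddot{e} + K_v \dot{e} + K_p e = M^{-1} w$ regardless of the initial angle.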

It is clear that the completeness property is satisfied in the case that (4.60) yields a linear system. We now argue, formally, that there are many interesting cases in the nonlinear setting for which the vector fields in the Lie algebra generated by $\{f, g_1, \ldots, g_m\}$ are complete.

Observe that the Hamiltonian is typically of the form

$$H(q, p) = \frac{1}{2}\, p^T M(q)^{-1} p + U(q) \tag{4.64}$$

By the conservation law $\dot{H} = 0$ for the system with zero dissipation, we have that $q$ lies within the interior of an n-dimensional torus (a compact set) for all time (with zero dissipation $q$ lies on the torus; otherwise within the interior). Furthermore, $p$ lies within the interior of a compact set formed by taking a union of ellipsoids parameterized by $q$ for all time. Thus, there exist no exploding solutions of $\dot{x} = f(x)$.

Now, consider the case where the $g_i$ are constant vector fields. This situation is a common one, e.g., torque inputs at the joints of a serial manipulator. Clearly, such vector fields produce no exploding solutions. On the other hand, it is possible, due to the nonlinear mass matrix $M(q)$, for brackets of the form $[f, g_i]$ to produce vector fields that are not complete. For purposes of this discussion, we take the position, without justification, that such brackets will often produce complete vector fields, and assume that smooth transition densities exist for the systems with which we work.


Constant  Parameter  Case 


We first consider the case where the parameters $c'_{ij}$ and $d_{ik}$ are independent of $q$ and $p$, i.e., are constants. In this situation, the correction term vanishes, $H = H'$, and $c_{ij} = c'_{ij}$. The following is modified from Fuller [51].

Theorem 4.3.10 (Fuller [51]) Consider the stochastically excited system corresponding to the forced, dissipatively perturbed, n-DOF Hamiltonian system governed by the SDEs (4.53)-(4.54) where $H$ is the Hamiltonian. Suppose that the coefficients $c_{ij}$ and $d_{ik}$ are independent of $q$ and $p$. Furthermore, suppose that the following constant ratio holds for all $i, j \in \underline{n}$:

$$\frac{c_{ij}}{b_{ij}} = \ell = \text{constant} \tag{4.65}$$

Then the unique stationary density $p_\infty$ that satisfies Equation (4.59) is

$$p_\infty(q, p) = C \exp(-2\, \ell\, H(q, p)) \tag{4.66}$$

where $C$ is a constant such that $\int p_\infty = 1$.


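Proof Observe that $p_\infty$ is a functional of $H$, i.e., $p_\infty = p_\infty(H(q, p))$, which implies by Lemma 2.7.6 that $\{p_\infty, H\} = 0$. Observe also that

$$\frac{\partial p_\infty}{\partial P_i} = -2\, \ell\, \frac{\partial H}{\partial P_i}\, p_\infty \tag{4.67}$$

Since the $c_{ij}$ and $d_{ik}$ are constants, the stationary Fokker-Planck equation (4.59) can be written

$$0 = \sum_{i=1}^{n} \frac{\partial}{\partial P_i} \sum_{j=1}^{n} \left[ c_{ij}\, \frac{\partial H}{\partial P_j}\, p_\infty + \frac{1}{2}\, b_{ij}\, \frac{\partial p_\infty}{\partial P_j} \right] \tag{4.68}$$

By Equations (4.65) and (4.67) we have, for $i, j \in \underline{n}$,

$$c_{ij}\, \frac{\partial H}{\partial P_j}\, p_\infty + \frac{1}{2}\, b_{ij}\, \frac{\partial p_\infty}{\partial P_j} = c_{ij}\, \frac{\partial H}{\partial P_j}\, p_\infty - \ell\, b_{ij}\, \frac{\partial H}{\partial P_j}\, p_\infty \tag{4.69}$$

$$= \left( c_{ij} - \ell\, b_{ij} \right) \frac{\partial H}{\partial P_j}\, p_\infty \tag{4.70}$$

$$= 0 \tag{4.71}$$

Finally, we observe that $p_\infty$ satisfies the boundary conditions (4.56) and (4.57). ■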

Remark 4.3.11 The density (4.66) is of Maxwell-Boltzmann form (see Remark 4.3.7). □

Remark 4.3.12 The condition (4.65) is referred to as the equipartition of energy condition. The terminology derives from the situation in statistical mechanics where each DOF of a multi-particle system is associated with the same mean energy. □

Remark 4.3.13 The equipartition of energy condition imposes a severe restriction on the class of systems for which (4.66) is the stationary density. □
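Theorem 4.3.10 can be spot-checked numerically for a one-DOF system. The sketch below is a hedged illustration: the Hamiltonian (a unit-mass oscillator with a hypothetical quartic stiffening term) and the constants $c$ and $d$ are assumptions, $b = d^2$ follows from the definition of $b_{ij}$ with a single input, and the residual of the stationary Fokker-Planck equation (4.55) is approximated with central differences.

```python
import math

C_DAMP, D_IN = 0.4, 0.8       # hypothetical constant coefficients c and d
ELL = C_DAMP / D_IN ** 2      # the ratio (4.65) with b = d^2


def hamiltonian(q, p):
    """Hypothetical one-DOF Hamiltonian, used only for illustration."""
    return 0.5 * p * p + 0.5 * q * q + 0.25 * q ** 4


def dH_dq(q, p):
    return q + q ** 3


def dH_dp(q, p):
    return p


def rho(q, p):
    """Unnormalized candidate density (4.66)."""
    return math.exp(-2.0 * ELL * hamiltonian(q, p))


def fp_residual(q, p, h=1e-3):
    """Central-difference residual of (4.55) for n = 1 and constant c, d."""
    def flux_q(a, b):
        return dH_dp(a, b) * rho(a, b)

    def flux_p(a, b):
        return (dH_dq(a, b) + C_DAMP * dH_dp(a, b)) * rho(a, b)

    term_q = -(flux_q(q + h, p) - flux_q(q - h, p)) / (2.0 * h)
    term_p = (flux_p(q, p + h) - flux_p(q, p - h)) / (2.0 * h)
    diffusion = 0.5 * D_IN ** 2 * (rho(q, p + h) - 2.0 * rho(q, p) + rho(q, p - h)) / h ** 2
    return term_q + term_p + diffusion


residuals = [fp_residual(q, p) for q, p in ((0.3, -0.7), (1.0, 0.5), (-0.8, 0.2))]
```

The residuals should be zero up to the finite-difference error, in agreement with (4.69)-(4.71).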


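The controllability function $L_c$ uniquely satisfies the HJB equation (4.14), which, for realization $(f, g)$ given in Equations (4.49) and (4.50), takes the form

$$0 = \{L_c, H\} + \sum_{i=1}^{n} \frac{\partial L_c}{\partial P_i} \sum_{j=1}^{n} \left[ -c_{ij}\, \frac{\partial H}{\partial P_j} + \frac{1}{2}\, b_{ij}\, \frac{\partial L_c}{\partial P_j} \right] \tag{4.72}$$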


The  relationship  between  the  stationary  density  and  the  controllability  function, 
and  an  exact  formula  for  the  latter  in  terms  of  the  Hamiltonian,  are  given  in  the 
following  result. 


Theorem 4.3.14 Consider the forced, dissipatively perturbed, n-DOF Hamiltonian control system, governed by the evolution equations (4.47) and (4.48), and realized by $(f, g)$ given in Equations (4.49) and (4.50). Under the conditions stated in Theorem 4.3.10, the unique controllability energy function for the system is given by

$$L_c(q, p) = -\log(p_\infty(q, p)) + C' = 2\, \ell\, H(q, p) + C' \tag{4.73}$$

where $p_\infty$ is the stationary density of the corresponding stochastically excited system and $C'$ is a constant such that $L_c(0, 0) = 0$.


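Proof It is necessary and sufficient to show that $L_c$ satisfies Equation (4.72). Since $L_c$ is a functional of $H$, Lemma 2.7.6 implies that $\{L_c, H\} = 0$. Furthermore, $\frac{\partial L_c}{\partial P_j} = 2\, \ell\, \frac{\partial H}{\partial P_j}$, so that Equation (4.72) becomes

$$0 = \sum_{i=1}^{n} \frac{\partial L_c}{\partial P_i} \sum_{j=1}^{n} \left[ -c_{ij}\, \frac{\partial H}{\partial P_j} + \ell\, b_{ij}\, \frac{\partial H}{\partial P_j} \right] \tag{4.74}$$

$$= \sum_{i=1}^{n} \frac{\partial L_c}{\partial P_i} \sum_{j=1}^{n} \left( \ell\, b_{ij} - c_{ij} \right) \frac{\partial H}{\partial P_j} \tag{4.75}$$

which is clearly satisfied given the equipartition of energy condition (4.65).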


General  Setting 

We now consider the more general situation where the parameters $c'_{ij}$ and $d_{ik}$ are permitted to be functions of $q$ and $p$. The following is modified from Zhu and Yang [171].

Theorem 4.3.15 (Zhu and Yang [171]) Consider the stochastically excited system corresponding to the forced, dissipatively perturbed, n-DOF Hamiltonian system governed by the SDEs (4.53)-(4.54) where $H$ is the Hamiltonian. Suppose that the following ratio holds for all $i \in \underline{n}$ and for some functional $h$ of $H$:

$$\frac{\sum_{j=1}^{n} \left( 2\, c_{ij}\, \frac{\partial H}{\partial P_j} + \frac{\partial b_{ij}}{\partial P_j} \right)}{\sum_{j=1}^{n} b_{ij}\, \frac{\partial H}{\partial P_j}} = h(H) \tag{4.76}$$

Then the unique stationary density that satisfies Equation (4.59) is

$$p_\infty(q, p) = C \exp\left( -\int_0^{H(q,p)} h(u)\, du \right) \tag{4.77}$$

where $C$ is a constant such that $\int p_\infty = 1$.


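Proof Assume that there exists $\phi(H)$ such that

$$p_\infty(q, p) = C \exp(-\phi(H(q, p))) \tag{4.78}$$

Then as before $\{p_\infty, H\} = 0$ and $\frac{\partial p_\infty}{\partial P_j} = -\frac{d\phi}{dH}\, \frac{\partial H}{\partial P_j}\, p_\infty$. The stationary Fokker-Planck equation (4.59) can then be written

$$0 = \frac{1}{2} \sum_{i=1}^{n} \frac{\partial}{\partial P_i} \left[ \sum_{j=1}^{n} \left( 2\, c_{ij}\, \frac{\partial H}{\partial P_j} + \frac{\partial b_{ij}}{\partial P_j} - b_{ij}\, \frac{\partial H}{\partial P_j}\, \frac{d\phi}{dH} \right) p_\infty \right] \tag{4.79}$$

which is clearly satisfied if we assign $\frac{d\phi}{dH} = h(H)$ where $h(H)$ is defined by Equation (4.76). Thus, the desired functional $\phi$ is obtained through integration yielding Equation (4.77). ■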


Remark 4.3.16 The density (4.77) is of Maxwell-Boltzmann form. □

Remark 4.3.17 The condition (4.76) is analogous to an equipartition of energy condition, again imposing a severe restriction on the class of systems for which (4.77) is the stationary density. □


The relationship between the stationary density and the controllability function, and an exact formula for the latter in terms of the Hamiltonian, are given in the following result.

Theorem 4.3.18 Consider the forced, dissipatively perturbed, n-DOF Hamiltonian control system, governed by the evolution equations (4.47) and (4.48), and realized by $(f, g)$ given in Equations (4.49) and (4.50). Suppose that the following ratio holds for all $i \in \underline{n}$ and for some functional $r$ of $H$:

$$\frac{\sum_{j=1}^{n} c_{ij}\, \frac{\partial H}{\partial p_j}}{\sum_{j=1}^{n} b_{ij}\, \frac{\partial H}{\partial p_j}} = r(H) \tag{4.80}$$

Then the unique controllability energy function for the system is given by

$$L_c(q, p) = 2 \int_0^{H(q,p)} r(u)\, du + C' \tag{4.81}$$

where $C'$ is a constant such that $L_c(0, 0) = 0$. Furthermore, if the $b_{ij}$ are independent of $p$ then

$$L_c(q, p) = -\log(p_\infty(q, p)) + C' \tag{4.82}$$

where $p_\infty$ is the stationary density of the corresponding stochastically excited system.


Proof Assume that $L_c(q, p) = \phi(H(q, p))$ for some functional $\phi$ of $H$. Then $\{L_c, H\} = 0$. The HJB equation (4.72) can be written

$$0 = \sum_{i=1}^{n} \frac{\partial H}{\partial p_i}\, \frac{d\phi}{dH} \sum_{j=1}^{n} \left( \frac{1}{2}\, b_{ij}\, \frac{\partial H}{\partial p_j}\, \frac{d\phi}{dH} - c_{ij}\, \frac{\partial H}{\partial p_j} \right) \tag{4.83}$$

which is clearly satisfied if we assign $\frac{d\phi}{dH} = 2\, r(H)$ where $r(H)$ is defined by Equation (4.80). Thus, the desired functional $\phi$ is obtained through integration yielding Equation (4.81). Moreover, if the $b_{ij}$ are independent of $p$ then $2\, r(H) = h(H)$ where $h(H)$ is defined in Equation (4.76). In that case Equation (4.82) holds. ■
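Formula (4.81) lends itself to a direct numerical check. The following sketch is an illustration under stated assumptions: a hypothetical pendulum Hamiltonian with $H(0,0) = 0$ (so that $C' = 0$), a hypothetical energy-dependent damping coefficient $c(q,p) = c_0 (1 + H)$, and a constant input coefficient $d$, so that $b = d^2$ and $r(H) = c_0 (1 + H) / d^2$. The integral in (4.81) is evaluated with a composite Simpson rule and the HJB residual uses finite-difference gradients of the result; the HJB equation is taken in the form (4.72) written for this single-DOF realization.

```python
import math

C0, D_IN = 0.3, 0.9   # hypothetical damping scale c0 and input coefficient d


def hamiltonian(q, p):
    """Hypothetical pendulum Hamiltonian with H(0, 0) = 0."""
    return 0.5 * p * p + 1.0 - math.cos(q)


def damping(q, p):
    """Hypothetical energy-dependent damping coefficient c(q, p)."""
    return C0 * (1.0 + hamiltonian(q, p))


def r_of_H(u):
    """The ratio (4.80) for this example, with b = d^2."""
    return C0 * (1.0 + u) / D_IN ** 2


def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0


def controllability_function(q, p):
    """Formula (4.81) with C' = 0."""
    return 2.0 * simpson(r_of_H, 0.0, hamiltonian(q, p))


def hjb_residual(q, p, h=1e-5):
    """Residual of dL/dx f + (1/2) (d dL/dp)^2 with f = (p, -sin q - c p)."""
    dL_dq = (controllability_function(q + h, p) - controllability_function(q - h, p)) / (2.0 * h)
    dL_dp = (controllability_function(q, p + h) - controllability_function(q, p - h)) / (2.0 * h)
    f_q = p
    f_p = -math.sin(q) - damping(q, p) * p
    return dL_dq * f_q + dL_dp * f_p + 0.5 * (D_IN * dL_dp) ** 2


H_test = hamiltonian(0.6, -0.4)
closed_form = 2.0 * C0 / D_IN ** 2 * (H_test + 0.5 * H_test ** 2)
numeric = controllability_function(0.6, -0.4)
residual = hjb_residual(0.6, -0.4)
```

Setting the damping to the constant `C0` instead would reduce `r_of_H` to a constant and recover the linear-in-$H$ formula of the constant parameter case.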
4.3.3 Monte-Carlo Experiments

In situations where we do not have an exact formula for the controllability function, we wish to use the approximation given by Equation (4.39) in the nonlinear balancing procedure. This requires determining the stationary density $p_\infty(x)$, or a suitable estimate. Approximating $p_\infty(x)$ via Monte-Carlo experiments is a natural approach.

Each experiment corresponds to a numerical simulation of the white noise driven system (4.22), with zero initial state and input corresponding to a discretized approximation of a sample path for Gaussian white noise. The numerical schemes that we used for approximating a white noise signal and integrating the SDEs are detailed in Appendix C. The state response trajectory $X_t$ for each experiment is sampled and recorded. An approximation of the steady-state, i.e., $\lim_{t \to \infty} X_t$, is generated by simulating the system over a sufficiently large time period, e.g., several multiples of its largest time constant.

We approximate the time evolution of the density function $p(x, t; 0, 0)$ by histogramming the collection of trajectories at fixed values of $t$. Likewise, we approximate the stationary density $p_\infty(x)$ by histogramming the collection of steady-state responses. Naturally, the approximations improve as the number of experiments in the collection increases. Moreover, a larger set of experiments allows for histogramming at a higher resolution. Statistics of the density such as $\mu(t)$, $\mu_\infty$, $R(t)$, and $R_\infty$ can be computed and analyzed to confirm the correctness of the data.

The results of Monte-Carlo experiments to approximate the controllability functions for an example problem are presented in Section 4.6.
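The procedure described above can be sketched in a few lines for a scalar example. This is a minimal sketch under stated assumptions and not the scheme of Appendix C: it uses the Euler-Maruyama method, a hypothetical linear Langevin drift $f(x) = -x$ with unit noise intensity (for which the stationary variance is $1/2$), a fixed random seed, and a histogram whose bins are centered on multiples of the bin width so that one bin is centered at the origin. All function names are illustrative.

```python
import math
import random


def simulate_endpoint(drift, t_end, dt, rng):
    """One Euler-Maruyama experiment from zero initial state; returns X at t_end."""
    x = 0.0
    for _ in range(int(round(t_end / dt))):
        x += drift(x) * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x


def estimate_lc(samples, bin_width):
    """Histogram estimate of (4.39): -log(density) + C, with C chosen so the value at 0 is 0."""
    counts = {}
    for x in samples:
        k = round(x / bin_width)          # index of the bin centered at k * bin_width
        counts[k] = counts.get(k, 0) + 1
    center_count = counts[0]
    # The common factor 1 / (N * bin_width) cancels in the density ratio.
    return {k * bin_width: math.log(center_count / c) for k, c in sorted(counts.items())}


rng = random.Random(0)
samples = [simulate_endpoint(lambda x: -x, 4.0, 0.01, rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
lc_hat = estimate_lc(samples, 0.25)       # maps bin center -> estimated L'_c
```

For this drift the exact controllability function is $x^2$ by (4.46), so the entries of `lc_hat` can be compared against the squares of the bin centers; the agreement is only statistical and degrades in the sparsely populated tails.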
4.4 Computing the Morse Coordinate Transformation

Recall that for an LTI system, the energy functions $L_c$ and $L_o$ globally take the form of quadratic functions given, respectively, by (4.5) and (4.6). We wish to generalize the linear balancing procedure to the nonlinear setting, but the functions $L_c$ and $L_o$ are not, in general, quadratic. However, we can appeal to some important results from critical point theory (see, e.g., [108]) in order to find a change of coordinates under which a smooth function takes a quadratic form locally around a non-degenerate critical point. The key result is the Morse lemma [110], which guarantees the existence of the desired canonical form for functions with a non-degenerate critical point defined on a finite-dimensional manifold, and an analogous result of Palais [125], which generalizes the notion to functions defined on a Hilbert space. The established results are presented from various points of view in [16, 57, 60, 88, 108].

4.4.1  The  Morse-Palais  Lemma 

The functions $L_c$ and $L_o$ are smooth real-valued mappings defined on local coordinates $x \in \mathbb{R}^n$ for n-dimensional manifold $M$. Thus, we can use the fact that the local behavior of a smooth real-valued function on a manifold is known at almost every point up to diffeomorphism. To see this, we introduce the following terminology (see [57, 60]).

Definition 4.4.1 (Critical Point) A point $p$ is said to be a critical point of the smooth real-valued function $f$ if the partial derivatives with respect to local coordinates $\{x_1, \ldots, x_n\}$ satisfy

$$\frac{\partial f}{\partial x_i}(p) = 0, \qquad i \in \underline{n} \tag{4.84}$$

Otherwise, the point $p$ is said to be a regular point of $f$. □

Remark 4.4.2 If a point $p$ is regular, then we can invoke the implicit function theorem and choose a coordinate system so that $f$ is simply the first coordinate function in a neighborhood of $p$. Thus the local behavior of $f$ around regular points is completely characterized. □

The functions $L_c$ and $L_o$ each have a critical point at 0. We now focus on characterizing the local behavior of a function around critical points.

Definition 4.4.3 (Non-degenerate Critical Point) A critical point $p$ of the smooth real-valued function $f$ is called non-degenerate if the Hessian matrix of second partials at $p$

$$D^2 f(p) = \left[ \frac{\partial^2 f}{\partial x_i\, \partial x_j}(p) \right] \tag{4.85}$$

is nonsingular. Otherwise $p$ is called degenerate. □

Definition 4.4.4 (Morse Function) A smooth real-valued function $f$ with a non-degenerate critical point at $p$ is said to be a Morse function at $p$. □

Remark 4.4.5 Under conditions outlined in Section 4.2, the functions $L_c$ and $L_o$ are Morse functions at 0. □

Definition 4.4.6 (Index, Nullity) The index of a bilinear functional $A$ on $\mathbb{R}^n$ is defined to be the maximal dimension of a subspace of $\mathbb{R}^n$ on which $A$ is negative definite. The nullity of $A$ is defined as the dimension of the nullspace of $A$. The index and nullity of a critical point $p$ of function $f$ are, respectively, the index and nullity of the bilinear functional $A(x, y) = \langle D^2 f(p)\, x, y \rangle$. □

There is a canonical form for a Morse function $f$ in the neighborhood of its non-degenerate critical point $p$, completely described by the index of $p$. This idea is made precise in finite dimensions in a theorem by Morse [110] and generalized to Hilbert spaces in a theorem by Palais [125]. The theorems state that there exists a local change of coordinates under which a Morse function is quadratic on some neighborhood of its non-degenerate critical point. We refer to this result as the Morse-Palais lemma, for which we present a version based on that in Milnor [108] and Lang [88]. Henceforth we assume without loss of generality that $p = 0$.

Theorem 4.4.7 (Morse-Palais) Let $f$ be a smooth real-valued function defined on an open neighborhood $O$ of 0 in the Hilbert space $\mathcal{E}$. Assume that $f(0) = 0$ and that 0 is a non-degenerate critical point of $f$. Then there exists a neighborhood $U \subset O$ of 0, a local change of coordinates $\phi$ on $U$, and an invertible symmetric operator $A$ such that

$$f(x) = \langle A\, \phi(x), \phi(x) \rangle_{\mathcal{E}}, \qquad x \in U \tag{4.86}$$

□

We defer the proof momentarily and present some related and supporting results.

Corollary 4.4.8 Let $f$ be a smooth real-valued function defined on an open neighborhood $O$ of 0 in the Hilbert space $\mathcal{E}$. Assume that $f(0) = 0$ and that 0 is a non-degenerate critical point of $f$. Then there exists a neighborhood $U \subset O$ of 0, a local change of coordinates $z = \xi(x)$ on $U$, and an orthogonal decomposition $\mathcal{E} = \mathcal{F} + \mathcal{F}^{\perp}$ such that if we write $z = \xi(x) = u + v$ with $u \in \mathcal{F}$ and $v \in \mathcal{F}^{\perp}$ then

$$f(z) = f(\xi(x)) = \langle u, u \rangle_{\mathcal{E}} - \langle v, v \rangle_{\mathcal{E}}, \qquad x \in U \tag{4.87}$$

□

Figure 4.1: An example of a Morse function on $\mathbb{R}^2$ (with level contours) before and after transformation to spherical quadratic form.

Remark 4.4.9 Consider the special case where $\mathcal{E} = \mathbb{R}^n$ and critical point 0 has index $r$. Define $z = \xi(x)$ for $x \in U$ and $\psi = \xi^{-1}$ on $\xi(U)$. Then Corollary 4.4.8 implies that

$$f(x) = f(\psi(z)) = -\sum_{i=1}^{r} z_i^2 + \sum_{i=r+1}^{n} z_i^2 \tag{4.88}$$

In the new coordinates, the function $f$ is said to be in spherical quadratic form. The transformation is illustrated in Figure 4.1. □

Definition 4.4.10 (Morse Coordinate Transformation) A change of coordinates $\psi$ satisfying (4.88) is said to be a Morse coordinate transformation for $f$ around 0. □
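A one-dimensional example makes Definition 4.4.10 concrete. The sketch below is an illustration only: the function $f(x) = x^2 + x^3$ is a hypothetical Morse function at 0 with index 0, it factors as $f(x) = h(x)\, x^2$ with $h(x) = 1 + x$, and the coordinate change $z = \xi(x) = x \sqrt{h(x)}$ is valid near 0. Inverting $\xi$ by Newton iteration is a choice made here for the example, not a method taken from the text.

```python
import math


def f(x):
    """Hypothetical Morse function at 0: f(0) = 0, f'(0) = 0, f''(0) = 2."""
    return x * x + x ** 3


def h(x):
    """Smooth factor in f(x) = h(x) x^2, positive near 0."""
    return 1.0 + x


def xi(x):
    """Forward change of coordinates z = xi(x), valid for x > -2/3 where xi is increasing."""
    return x * math.sqrt(h(x))


def psi(z, iterations=50):
    """Morse coordinate transformation psi = xi^{-1}, computed by Newton iteration."""
    x = z
    for _ in range(iterations):
        s = math.sqrt(1.0 + x)
        x -= (x * s - z) / (s + x / (2.0 * s))
    return x


checks = [(z, f(psi(z))) for z in (-0.3, 0.1, 0.5)]
```

Each pair in `checks` should satisfy $f(\psi(z)) = z^2$, which is (4.88) with $n = 1$ and $r = 0$.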

The original proof by Morse uses the Gram-Schmidt orthogonalization process,
which is essentially a coordinate-by-coordinate induction argument. The
generalization by Palais is proved without a coordinate-wise procedure, which, as we
demonstrate later, is advantageous for purposes of computation. Moreover,
certain decompositions of smooth functions with non-degenerate critical points are
integral to the proofs. We merely outline the proofs, emphasizing the
decompositions since they are essential to computing the desired transformation. For a
concise presentation of the Morse lemma and proof see Milnor [108]. The Palais
version is found in [88, 125].

The following lemma provides a decomposition of any smooth real-valued
function defined on a finite-dimensional manifold. It is a simple application of the first
fundamental theorem of integral calculus.

Lemma 4.4.11 Let f be a smooth real-valued function defined in an open convex
neighborhood $O$ of 0 in n-dimensional manifold M. Then there exist smooth
functions $g_i$, $i \in \underline{n}$ on $O$ such that

\[ f(x) = \sum_{i=1}^{n} g_i(x)\, x_i, \qquad i \in \underline{n},\ x \in O \tag{4.89} \]

Furthermore, if 0 is a critical point of f then

\[ g_i(0) = \frac{\partial f}{\partial x_i}(0) \tag{4.90} \]

□


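The proof is instructive in that it shows us how to compute one such collection
of functions $g_i$, $i \in \underline{n}$.

Proof By the fundamental theorem of calculus

\[ f(x) - f(0) = \int_0^1 \frac{d}{dt} f(tx)\, dt = \int_0^1 \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(tx)\, x_i \, dt \tag{4.91} \]

Define

\[ g_i(x) = \int_0^1 \frac{\partial f}{\partial x_i}(tx)\, dt, \qquad i \in \underline{n} \tag{4.92} \]

to yield (4.89) when $f(0) = 0$. ■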
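
The integral defining each $g_i$ (the average of $\partial f / \partial x_i$ along the segment from 0 to $x$) is straightforward to approximate numerically. The following is a minimal sketch, not the implementation used in this work: it assumes the gradient of f is available as a function, uses a 20-point Gauss-Legendre rule (an arbitrary choice), and the test function `f` and the names `smooth_decomposition`, `grad_f` are hypothetical.

```python
import numpy as np

# Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, 1].
_nodes, _weights = np.polynomial.legendre.leggauss(20)
T_NODES = 0.5 * (_nodes + 1.0)
T_WEIGHTS = 0.5 * _weights


def smooth_decomposition(grad_f, x):
    """Approximate g_i(x) = int_0^1 (df/dx_i)(t x) dt for all i at once."""
    x = np.asarray(x, dtype=float)
    samples = np.array([grad_f(t * x) for t in T_NODES])  # shape (nodes, n)
    return T_WEIGHTS @ samples


# Hypothetical test function with f(0) = 0.
def f(x):
    return np.sin(x[0]) + x[0] * x[1] ** 2


def grad_f(x):
    return np.array([np.cos(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]])


x = np.array([0.7, -0.4])
g = smooth_decomposition(grad_f, x)
residual = abs(f(x) - g @ x)                     # f(x) = sum_i g_i(x) x_i
g_at_origin = smooth_decomposition(grad_f, np.zeros(2))
```

Here `residual` measures how well the computed $g_i$ reproduce f, and `g_at_origin` should coincide with the gradient of f at 0. When f is known only on a grid, the gradient would itself have to be approximated, which this sketch does not attempt.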

Applying Lemma 4.4.11 twice to f around a critical point at 0 results in the
following decomposition.

Lemma 4.4.12 Let f be a smooth real-valued function defined in an open convex
neighborhood $O$ of 0 in n-dimensional manifold M. Let 0 be a critical point of f.
Then there exist smooth functions $h_{ij}$, $i, j \in \underline{n}$ on $O$ such that

\[ f(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} h_{ij}(x)\, x_i x_j, \qquad i, j \in \underline{n},\ x \in O \tag{4.93} \]

Moreover, the symmetry property

\[ h_{ij}(x) = h_{ji}(x), \qquad i, j \in \underline{n},\ x \in O \tag{4.94} \]

holds, and it is true that

\[ h_{ij}(0) = \frac{1}{2} \frac{\partial^2 f}{\partial x_i\, \partial x_j}(0) = \frac{1}{2} \left[ D^2 f(0) \right]_{ij} \tag{4.95} \]

□
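
As an illustration of Lemma 4.4.12, the two nested integrations can be collapsed into one. If $g_i(x) = \int_0^1 \partial_i f(tx)\,dt$ and $h_{ij}(x) = \int_0^1 \partial_j g_i(sx)\,ds$, exchanging the order of integration gives $H(x) = \int_0^1 (1 - \tau)\, D^2 f(\tau x)\, d\tau$, which is the integral form of the Taylor remainder. This identity is our observation rather than a formula stated in the text, and the sketch below assumes $f(0) = 0$, an analytically available Hessian, and a hypothetical test function.

```python
import numpy as np

_nodes, _weights = np.polynomial.legendre.leggauss(20)
TAU = 0.5 * (_nodes + 1.0)      # quadrature nodes on [0, 1]
W = 0.5 * _weights              # matching weights


def morse_decomposition(hess_f, x):
    """Approximate H(x) = int_0^1 (1 - tau) * Hess f(tau * x) dtau."""
    x = np.asarray(x, dtype=float)
    return sum(w * (1.0 - t) * hess_f(t * x) for t, w in zip(TAU, W))


# Hypothetical Morse function: f(0) = 0, grad f(0) = 0, Hessian diag(2, -2) at 0.
def f(x):
    return 2.0 * (1.0 - np.cos(x[0])) - x[1] ** 2 + x[0] * x[1] ** 2


def hess_f(x):
    return np.array([[2.0 * np.cos(x[0]), 2.0 * x[1]],
                     [2.0 * x[1], -2.0 + 2.0 * x[0]]])


x = np.array([0.5, 0.3])
H = morse_decomposition(hess_f, x)
H0 = morse_decomposition(hess_f, np.zeros(2))   # half the Hessian at the origin
residual = abs(f(x) - x @ H @ x)                # f(x) = x^T H(x) x
```

The matrix `H` is symmetric by construction, and `H0` reproduces the invariant value given by half the second derivative at the origin.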


We now return to the proof of Theorem 4.4.7. The argument follows from
decomposition (4.93). We denote $H(x) = [H(x)]_{ij} = [h_{ij}(x)]$. Some details are
omitted here (see [88]).


Morse-Palais Lemma (Theorem 4.4.7)

Proof Non-degeneracy of critical point 0 ensures that $H(0)$ is nonsingular.
Equation (4.86) is satisfied if, for all $x \in O$, we define $A$ by

\[ A(x) = H(0)\, x \tag{4.96} \]

and define $\phi$ by

\[ \phi(x) = C(x)\, x \tag{4.97} \]

where

\[ \bigl( C(x)^{*} A\, C(x) \bigr)(x) = H(x)\, x \tag{4.98} \]

and $C(x)^{*}$ denotes the adjoint operator. The desired operator-valued map $C$ is
given, for $x \in O$, by

\[ C(x) = B(x)^{1/2} \tag{4.99} \]

where $B$ is defined for $x \in O$ by

\[ B(x)\, x = H(0)^{-1} H(x)\, x \tag{4.100} \]

The operator square root in (4.99) is guaranteed to exist for x in some neighborhood
$U \subset O$ of 0 because $B(x)$ is close to the identity operator $I$ on a neighborhood of
0, and the square root function has a convergent power series expansion near $I$. ■

Remark 4.4.13 Corollary 4.4.8 then results directly from the fact that operator $A$
is symmetric, positive definite on $\mathcal{F}$, and negative definite on $\mathcal{F}^{\perp}$. This allows for
the change of coordinates $z = A^{1/2} x$ on $\mathcal{F}$ and $z = (-A)^{1/2} x$ on $\mathcal{F}^{\perp}$ to yield (4.88).
□

4.4.2  Properties 

A Morse coordinate transformation $\psi$ for f around 0 is not unique. This can be
argued as a consequence of the non-uniqueness of the functions $g_i$ in the
decomposition (4.89) and the functions $h_{ij}$ in the decomposition (4.93). Consider the
isotropy transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that

\[ T(x)\, x = x, \qquad x \in \mathbb{R}^n \tag{4.101} \]

At each point x, the isotropy $T(x)$ is a pure rotation about an axis passing through
the origin and x. Let $G(x) = (g_1(x), \ldots, g_n(x))$. Then Equation (4.89) implies that

\[ f(x) = G(x)\, x = G(x)\, T(x)\, x = \tilde{G}(x)\, x \tag{4.102} \]

where $\tilde{G}(x) = G(x)\,T(x)$ comprises another set of $g_i$ satisfying (4.89). The
non-uniqueness of the functions $h_{ij}$ in (4.93) follows as an immediate consequence.

Remark 4.4.14 Even though the functions $g_i$ and $h_{ij}$ are not unique, their values
at the origin, i.e., $g_i(0)$ and $h_{ij}(0)$, are invariants of the function f and given by
first and second derivatives, respectively. □


The non-uniqueness of the Morse coordinate transformation can also be shown
from the following viewpoint. Consider f with index r, and rewrite (4.88) as

\[ f(\psi(z)) = z^{T} E\, z \tag{4.103} \]

where $E$ is the block diagonal matrix

\[ E = \begin{bmatrix} -I_r & 0 \\ 0 & I_{n-r} \end{bmatrix} \tag{4.104} \]

Let $\Theta_1 \in O(r)$ and $\Theta_2 \in O(n-r)$ and define the block diagonal matrix

\[ \Theta = \begin{bmatrix} \Theta_1 & 0 \\ 0 & \Theta_2 \end{bmatrix} \tag{4.105} \]

and the change of coordinates

\[ \tilde{\psi}(z) = \psi(\Theta z), \qquad z \in \psi^{-1}(U) \tag{4.106} \]

Then

\[ f(\tilde{\psi}(z)) = (\Theta z)^{T} E\, (\Theta z) = z^{T} (\Theta^{T} E\, \Theta)\, z = z^{T} E\, z \tag{4.107} \]

Thus, $\tilde{\psi}$ is also a Morse coordinate transformation for f around 0.
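
The key step in the argument above is that a block diagonal orthogonal matrix commutes with the signature matrix in the sense $\Theta^{T} E\, \Theta = E$. A small numerical check of this identity is sketched below; the dimensions, the random seed, and the helper `random_orthogonal` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 5, 2


def random_orthogonal(k):
    """Orthogonal factor of a QR decomposition of a random k x k matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((k, k)))
    return q


# Signature matrix E = diag(-I_r, I_{n-r}) and block diagonal Theta.
E = np.diag(np.concatenate([-np.ones(r), np.ones(n - r)]))
Theta = np.zeros((n, n))
Theta[:r, :r] = random_orthogonal(r)
Theta[r:, r:] = random_orthogonal(n - r)

z = rng.standard_normal(n)
invariance_error = np.abs(Theta.T @ E @ Theta - E).max()
value_error = abs((Theta @ z) @ E @ (Theta @ z) - z @ E @ z)
```

Both error quantities should be at the level of floating point round-off, confirming that the quadratic form is unchanged by the rotation of coordinates within each block.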


Remark 4.4.15 A function f with non-degenerate critical point 0 admits a family
of Morse coordinate transformations, parameterized by the spaces of orthogonal
matrices $O(r)$ and $O(n-r)$. It is not clear, however, if this family exhausts the
entire collection of Morse transformations for f around 0. □

Example 4.4.16 To illustrate the nonuniqueness property, consider the polynomial
function f on $\mathbb{R}^2$

\[ f(x) = 3x_1^3 - x_1^2 x_2 - x_1 x_2^2 + 2x_2^3 + 5x_1^2 - 2x_1 x_2 + 2x_2^2 \tag{4.108} \]

which has a non-degenerate critical point at (0,0). Applying the decomposition
(4.93), i.e., $f(x) = x^{T} H(x)\, x$, yields the invariant

\[ H(0) = \begin{bmatrix} 5 & -1 \\ -1 & 2 \end{bmatrix} \tag{4.109} \]

One valid choice for $H(x)$, computed via repeated application of (4.92), is given by

\[ H(x) = \begin{bmatrix} 5 + 3x_1 - \tfrac{1}{3}x_2 & -1 - \tfrac{1}{3}x_1 - \tfrac{1}{3}x_2 \\ -1 - \tfrac{1}{3}x_1 - \tfrac{1}{3}x_2 & 2 - \tfrac{1}{3}x_1 + 2x_2 \end{bmatrix} \tag{4.110} \]

Two other valid choices are

\[ H(x) = \begin{bmatrix} 5 + 3x_1 & -1 - 0.5x_1 - 0.5x_2 \\ -1 - 0.5x_1 - 0.5x_2 & 2 + 2x_2 \end{bmatrix} \tag{4.111} \]

and

\[ H(x) = \begin{bmatrix} 5 + 3x_1 - x_2 & -1 \\ -1 & 2 + 2x_2 - x_1 \end{bmatrix} \tag{4.112} \]

Each different choice of $H(x)$ will result in a different Morse coordinate
transformation via (4.100). □
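
The claim in Example 4.4.16 that several distinct matrix functions reproduce the same polynomial is easy to check numerically. The sketch below evaluates $x^{T} H(x)\, x$ for the three choices at random points; the coefficients $1/3$ in the first choice are those obtained by carrying out the integration (4.92) twice by hand, and the function names `H_a`, `H_b`, `H_c` are ours.

```python
import numpy as np


def f(x):
    x1, x2 = x
    return (3 * x1**3 - x1**2 * x2 - x1 * x2**2 + 2 * x2**3
            + 5 * x1**2 - 2 * x1 * x2 + 2 * x2**2)


def H_a(x):
    # Choice obtained by applying the integral decomposition twice.
    x1, x2 = x
    off = -1 - x1 / 3 - x2 / 3
    return np.array([[5 + 3 * x1 - x2 / 3, off], [off, 2 - x1 / 3 + 2 * x2]])


def H_b(x):
    # Mixed cubic terms split evenly into the off-diagonal entries.
    x1, x2 = x
    off = -1 - 0.5 * x1 - 0.5 * x2
    return np.array([[5 + 3 * x1, off], [off, 2 + 2 * x2]])


def H_c(x):
    # Mixed cubic terms absorbed into the diagonal entries.
    x1, x2 = x
    return np.array([[5 + 3 * x1 - x2, -1.0], [-1.0, 2 + 2 * x2 - x1]])


rng = np.random.default_rng(1)
points = rng.uniform(-1.0, 1.0, size=(50, 2))
max_err = max(abs(x @ H(x) @ x - f(x)) for H in (H_a, H_b, H_c) for x in points)
```

All three choices agree at the origin, as the invariance of $H(0)$ requires, and differ away from it.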


Remark 4.4.17 We were able to perform the above calculations for a given
(contrived) polynomial f. However, in general, there exist no closed form expressions
for functions $g_i$, $h_{ij}$, or $\psi$, even if we have a formula for f. Moreover, for our
purposes, the Morse function f may be known only at discrete points on a grid.
Thus, we always apply Equation (4.92) to our computations. □

It is worth noting that not every function $\psi$ satisfying (4.88) is smooth, a
condition required for a function to be a valid change of coordinates. Consider the
following example.


Example 4.4.18 Let $\psi$ be a function on $U$ such that

\[ z = \psi^{-1}(x) = \frac{\sqrt{f(x)}}{\| x \|}\, x, \qquad x \in U \tag{4.113} \]

Then $\psi$ satisfies (4.103) with $E = I_n$. However, the first derivative of $\psi^{-1}$ does
not exist at 0. The function $\psi$ is not smooth and thus is not a valid change of
coordinates. □


Remark 4.4.19 Observe that the proof of Theorem 4.4.7 also uses a square root
operation, but avoids any associated pitfalls by ensuring that the square root operand
remains close to the identity. □


4.4.3  Algorithm 

We present here, somewhat loosely, an algorithm for numerical implementation of
Theorem 4.4.7 and Corollary 4.4.8. The algorithm is presented more rigorously in
Section 4.5.4 once the computational framework has been introduced. The
algorithm takes a Morse function f and returns a neighborhood $U$, Morse coordinate
transformation $\phi$, and invertible symmetric matrix $A$ under which f takes the
desired form (4.86) on $U$. An additional algorithm takes $\phi$, $A$, and $U$ and
returns a coordinate transformation $\psi$ under which f takes the spherical quadratic
form (4.87).

The  algorithms  are  based  primarily  on  calculations  appearing  in  the  proofs  in 
Section  4.4.1.  The  main  building  blocks  are  as  follows. 

smooth function decomposition Given smooth real-valued function f, return
smooth functions $g_i$, $i \in \underline{n}$, such that Equation (4.89) holds. This is
accomplished via the integration in Equation (4.92).

1. Compute approximate partial derivatives $\partial f / \partial x_i$, $i \in \underline{n}$.

2. For each point x in the domain of definition of f, compute approximate
integrals $g_i(x) = \int_0^1 \frac{\partial f}{\partial x_i}(tx)\, dt$, $i \in \underline{n}$.


Morse function decomposition Given Morse function f, i.e., f has non-degenerate
critical point at 0, return smooth functions $h_{ij}$, $i, j \in \underline{n}$ such that
Equation (4.93) holds. This is accomplished via n + 1 smooth function
decompositions.

1. Apply the smooth function decomposition to f yielding $g_i$, $i \in \underline{n}$.

2. Apply the smooth function decomposition to each of the $g_i$ yielding $h_{ij}$,
$i, j \in \underline{n}$.


matrix square root Given matrix $B$ close to the identity, return its square root
$C$, i.e., $B = C^2$. The matrix $B$ must satisfy

\[ \| I - B \| < 1 \tag{4.114} \]

In that case, the following algorithm converges to a fixed point corresponding
to the desired matrix $C = B^{1/2}$.

\[ C_{k+1} = C_k + \tfrac{1}{2} \left( B - C_k^2 \right), \quad k = 0, 1, \ldots, \qquad C_0 = I \tag{4.115} \]

The convergence of the sequence $\{C_k\}$ to the fixed point $B^{1/2}$ can be shown
to be a consequence of the contraction mapping principle.

Morse-Palais transformation Given Morse function f, return neighborhood $U$,
coordinate transformation $\phi$, and invertible symmetric matrix $A$ such that
Equation (4.86) holds.

1. Apply the Morse function decomposition to f yielding $h_{ij}$, $i, j \in \underline{n}$. Let
$H(x) = [h_{ij}(x)]$ and $A = H(0)$.

2. For each point x in the domain of definition of f:

(a) Compute the solution $B$ of the matrix equation $A B = H(x)$.

(b) If $\| I - B \| < 1$ then:

i. Apply the matrix square root algorithm to compute $C = B^{1/2}$.

ii. Let $\phi(x) = C x$.

iii. Include the point x in the neighborhood $U$.

(c)  Otherwise,  the  point  x  is  not  in  the  neighborhood  U  and  no  further 
calculations  apply. 

This  procedure  provides  an  estimate  of  the  neighborhood  U  for  which  the 
function  can  be  transformed  to  the  canonical  quadratic  form.  It  is  possible 
that  the  maximal  neighborhood  is  larger. 

spherical transformation Given transformation $\phi$ and invertible symmetric
matrix $A$ such that Equation (4.86) holds, return index r and coordinate
transformation $\psi$ such that Equation (4.88) holds.

1. Compute the spectral decomposition of matrix $A$, i.e., $A = V \Lambda V^{T}$.

2. Let $\Sigma = \mathrm{diag}(|\lambda_1|^{1/2}, \ldots, |\lambda_n|^{1/2})$.

3. Let r equal the number of $\lambda_i$ such that $\lambda_i < 0$.

4. Let $R = \Sigma V^{T}$.

5. For each point x in the domain of definition of f, let $\psi(x) = R\, \phi(x)$.
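
The spherical transformation building block reduces to one symmetric eigendecomposition followed by a rescaling. The sketch below is illustrative only: it assumes the scaling uses the square roots of the eigenvalue magnitudes (which is what makes the transformed quadratic form have coefficients of unit magnitude), relies on `numpy.linalg.eigh` returning eigenvalues in ascending order so that the negative directions come first, and uses a hypothetical matrix `A` and vector `phi`.

```python
import numpy as np


def spherical_map(A):
    """Return index r and matrix R such that phi^T A phi = -|z_1..r|^2 + |z_r+1..n|^2
    for z = R phi, given an invertible symmetric matrix A."""
    lam, V = np.linalg.eigh(A)          # ascending order: negative eigenvalues first
    r = int(np.sum(lam < 0))            # index = number of negative eigenvalues
    R = np.diag(np.sqrt(np.abs(lam))) @ V.T
    return r, R


A = np.array([[1.0, 2.0], [2.0, -1.0]])   # hypothetical invertible symmetric matrix
r, R = spherical_map(A)

phi = np.array([0.3, -0.8])               # stands in for phi(x) at one grid point
z = R @ phi
lhs = phi @ A @ phi
rhs = -np.sum(z[:r] ** 2) + np.sum(z[r:] ** 2)
```

In a grid-based setting the matrix `R` is computed once, since $A$ does not depend on the grid point, and is then applied to the stored value of $\phi$ at each point.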


Remark 4.4.20 The terminology "Morse function decomposition" is somewhat
misleading since the decomposition (4.93) merely requires a critical point that is
not necessarily non-degenerate. However, we adopt the terminology for lack of a
better name and because we are applying the decomposition to Morse functions. □

4.5  Computing  the  Balancing  Transformation 

A realization (f, g, h) for a nonlinear system is transformed to balanced form in
the  Scherpen  procedure  by  composing  several  local  coordinate  transformations  as 
illustrated  in  Figure  4.2.  The  transformations  are,  in  general,  nonlinear,  and  result 
from  manipulations  on  the  controllability  and  observability  energy  functions.  Each 
transformation  is  a  local  generalization  of  a  corresponding  linear  transformation 
in  the  procedure  for  balancing  LTI  systems. 


Figure 4.2: Overview of coordinate transformations for nonlinear balancing.

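We use the following terminology and notation in describing the required
transformations. Applying the Morse-Palais lemma to the controllability function gives
the Morse coordinate transformation, denoted $\Phi_M : \tilde{U} \to U$, where $U$ and $\tilde{U}$ are
neighborhoods of 0. We also define local transformations which bring the
realization to forms analogous to input-normal and balanced, denoted $\nu : \tilde{U} \to \bar{U}$ and
$\eta : \bar{U} \to \hat{U}$, respectively, where $\bar{U}$ and $\hat{U}$ are neighborhoods of 0. The
transformations are written

\[ x = \Phi_M(\tilde{x}), \qquad x \in U,\ \tilde{x} \in \tilde{U} \tag{4.116} \]

\[ \bar{x} = \nu(\tilde{x}), \qquad \tilde{x} \in \tilde{U},\ \bar{x} \in \bar{U} \tag{4.117} \]

\[ \hat{x} = \eta(\bar{x}), \qquad \bar{x} \in \bar{U},\ \hat{x} \in \hat{U} \tag{4.118} \]

Composing $\Phi_M$ with $\nu^{-1}$ results in the input-normal coordinate transformation
$\Phi_I$, i.e.,

\[ x = \Phi_I(\bar{x}) = \Phi_M(\nu^{-1}(\bar{x})), \qquad x \in U,\ \bar{x} \in \bar{U} \tag{4.119} \]

Composing $\Phi_M$ with $\nu^{-1}$ and $\eta^{-1}$ results in the balancing transformation $\Phi_B$, i.e.,

\[ x = \Phi_B(\hat{x}) = \Phi_M(\nu^{-1}(\eta^{-1}(\hat{x}))), \qquad x \in U,\ \hat{x} \in \hat{U} \tag{4.120} \]

The balanced realization $(\hat{f}, \hat{g}, \hat{h})$ is given by

\[ \hat{f}(\hat{x}) = \left[ D\Phi_B(\hat{x}) \right]^{-1} f(\Phi_B(\hat{x})) \]

\[ \hat{g}_i(\hat{x}) = \left[ D\Phi_B(\hat{x}) \right]^{-1} g_i(\Phi_B(\hat{x})), \qquad i \in \underline{m} \]

\[ \hat{h}(\hat{x}) = h(\Phi_B(\hat{x})) \]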

Remark 4.5.1 There is an equivalent dual procedure (see [140]), in which the
Morse-Palais lemma is applied to the observability energy function, and the
intermediate step takes the realization to output-normal (instead of input-normal)
form. Its use in an appropriately revised balancing procedure results in an
equivalent balanced realization. The computational methods are essentially identical.
□

In this section, we provide the mathematical and computational frameworks
for performing the individual steps and combining them into the overall balancing
procedure.

4.5.1  Morse-Palais  Form 

The Morse-Palais lemma is applied in the Scherpen balancing procedure by
observing that the controllability function $L_c$ has a non-degenerate critical point at
0. Therefore, there exists a Morse coordinate transformation under which $L_c$ is
quadratic on a neighborhood of 0. Equation (4.88) leads directly to the following
result.

Corollary 4.5.2 (Morse Coordinate Transformation for $L_c$ Around 0)

There exist neighborhoods $U$ and $\tilde{U}$ of 0 and a local coordinate transformation

\[ \Phi_M : \tilde{U} \to U, \quad \tilde{x} \mapsto x = \Phi_M(\tilde{x}), \quad \Phi_M(0) = 0 \tag{4.121} \]

such that

\[ \tilde{L}_c(\tilde{x}) = L_c(\Phi_M(\tilde{x})) = \tfrac{1}{2}\, \tilde{x}^{T} \tilde{x}, \qquad \tilde{x} \in \tilde{U} \tag{4.122} \]

where $U = \Phi_M(\tilde{U})$. □

Example 4.5.3 Consider the case of a LTI system with controllability Gramian
$W_c$ and controllability function $L_c(x) = \tfrac{1}{2}\, x^{T} W_c^{-1} x$. Let $W_c = L L^{T}$ be the
Cholesky decomposition for the symmetric positive-definite matrix $W_c$. Then the
Morse coordinate transformation for $L_c$ around 0 is given by

\[ x = \Phi_M(\tilde{x}) = L \tilde{x} \tag{4.123} \]

resulting in $\tilde{L}_c(\tilde{x}) = \tfrac{1}{2}\, \tilde{x}^{T} \tilde{x}$. □

4.5.2 Input-Normal Form

Recall the input-normal form for a stable minimal linear system realization
(A, B, C), i.e., Gramians take the form $W_c = I$ and $W_o = \Sigma^2 = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$.
Consequently, the corresponding energy functions are given by

\[ L_c(x) = \tfrac{1}{2}\, x^{T} x \tag{4.124} \]

\[ L_o(x) = \tfrac{1}{2}\, x^{T} \Sigma^2 x \tag{4.125} \]

We seek a local coordinate transformation under which the nonlinear system
realization (f, g, h) and corresponding energy functions $L_c$ and $L_o$ take an analogous
form in a neighborhood of 0.

Assume that we already have applied the Morse coordinate transformation $\Phi_M$
guaranteed by Corollary 4.5.2. Let the energy functions in the new coordinates be
denoted $\tilde{L}_c$ and $\tilde{L}_o$. The transformed controllability function $\tilde{L}_c$ is of the desired
form (4.124). We need an additional change of coordinates under which $\tilde{L}_o$ takes
a form analogous to (4.125) while preserving the form of $\tilde{L}_c$. The following is a
direct result of Lemma 4.4.12 applied to $\tilde{L}_o$.

Corollary 4.5.4 For each $\tilde{x} \in \tilde{U}$ there exists a matrix $M(\tilde{x})$ such that

\[ \tilde{L}_o(\tilde{x}) = L_o(\Phi_M(\tilde{x})) = \tfrac{1}{2}\, \tilde{x}^{T} M(\tilde{x})\, \tilde{x} \tag{4.126} \]

Moreover, $M(\tilde{x})$ is symmetric everywhere on $\tilde{U}$ and

\[ M(0) = D^2 \tilde{L}_o(0) \tag{4.127} \]

□

To complete the input-normal analogy we must diagonalize $M$ throughout $\tilde{U}$
while preserving the form of $\tilde{L}_c$. The following result from Kato [77] provides
conditions for the existence of such a transformation.

Lemma 4.5.5 (Kato [77]) If the number of distinct eigenvalues of $M(\tilde{x})$ is
constant on a neighborhood $\tilde{U}$ of 0, then the eigenvalues and eigenvectors of $M(\tilde{x})$ are
smooth functions of $\tilde{x} \in \tilde{U}$. □

Henceforth, we assume that $M(\tilde{x})$ always has a constant number of distinct
eigenvalues on $\tilde{U}$. Since $M(0)$ is symmetric and positive-definite, it is
diagonalizable. Thus, our assumption together with Lemma 4.5.5 implies that $M(\tilde{x})$
is smoothly diagonalizable throughout $\tilde{U}$, i.e., there exist smooth matrix-valued
functions $T$ and $\Lambda$ such that

\[ M(\tilde{x}) = T(\tilde{x})\, \Lambda(\tilde{x})\, T(\tilde{x})^{T}, \qquad \tilde{x} \in \tilde{U} \tag{4.128} \]

where $T(\tilde{x})$ is orthogonal for each $\tilde{x} \in \tilde{U}$ and $\Lambda(\tilde{x})$ takes the form

\[ \Lambda(\tilde{x}) = \mathrm{diag}(\lambda_1(\tilde{x}), \ldots, \lambda_n(\tilde{x})), \qquad \tilde{x} \in \tilde{U} \tag{4.129} \]

with $\lambda_1(\tilde{x}) \geq \cdots \geq \lambda_n(\tilde{x}) > 0$ by convention.

To construct the input-normal coordinate transformation, we define the change
of coordinates

\[ \nu : \tilde{U} \to \bar{U}, \quad \tilde{x} \mapsto \bar{x} = \nu(\tilde{x}), \quad \nu(0) = 0 \tag{4.130} \]

by

\[ \bar{x} = \nu(\tilde{x}) = T(\tilde{x})^{T} \tilde{x}, \qquad \tilde{x} \in \tilde{U} \tag{4.131} \]

where $\bar{U} = \nu(\tilde{U})$. Observe that $\nu$ is linear and orthogonal for each fixed $\tilde{x}$.
Composing $\nu^{-1}$ with $\Phi_M$ yields the input-normal coordinate transformation

\[ x = \Phi_I(\bar{x}) = \Phi_M(\nu^{-1}(\bar{x})), \qquad \bar{x} \in \bar{U} \tag{4.132} \]

where $U = \Phi_I(\bar{U})$. This is summarized in the following.
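
Lemma 4.5.6 (Input-Normal Form) There exist neighborhoods $U$ and $\bar{U}$ of 0
and a local change of coordinates

\[ \Phi_I : \bar{U} \to U, \quad \bar{x} \mapsto x = \Phi_I(\bar{x}), \quad \Phi_I(0) = 0 \tag{4.133} \]

such that

\[ \bar{L}_c(\bar{x}) = L_c(\Phi_I(\bar{x})) = \tfrac{1}{2}\, \bar{x}^{T} \bar{x} \tag{4.134} \]

\[ \bar{L}_o(\bar{x}) = L_o(\Phi_I(\bar{x})) = \tfrac{1}{2}\, \bar{x}^{T} W(\bar{x})\, \bar{x} \tag{4.135} \]

where

\[ W(\bar{x}) = \mathrm{diag}(\mu_1(\bar{x}), \ldots, \mu_n(\bar{x})) \tag{4.136} \]

\[ \mu_i(\bar{x}) = \lambda_i(\nu^{-1}(\bar{x})), \qquad i \in \underline{n} \tag{4.137} \]

Example 4.5.7 We continue with the LTI system of Example 4.5.3. Let the
system have observability Gramian $\tilde{W}_o$ and observability function $\tilde{L}_o(\tilde{x}) = \tfrac{1}{2}\, \tilde{x}^{T} \tilde{W}_o \tilde{x}$
in the Morse coordinates.
Let $\tilde{W}_o = T \Sigma^2 T^{T}$ be the spectral decomposition for the symmetric positive-definite
matrix $\tilde{W}_o$. Then the input-normal transformation is given by

\[ \bar{x} = \nu(\tilde{x}) = T^{T} \tilde{x} \tag{4.138} \]

\[ x = \Phi_I(\bar{x}) = L\, T\, \bar{x} \tag{4.139} \]

resulting in $\bar{L}_c(\bar{x}) = \tfrac{1}{2}\, \bar{x}^{T} \bar{x}$ and $\bar{L}_o(\bar{x}) = \tfrac{1}{2}\, \bar{x}^{T} \Sigma^2 \bar{x}$. □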


4.5.3  Balanced  Form 


Recall the balanced form for a stable minimal linear system realization (A, B, C),
i.e., Gramians are such that $W_c = \Sigma = W_o$. Consequently, the corresponding
energy functions are given by

\[ L_c(x) = \tfrac{1}{2}\, x^{T} \Sigma^{-1} x \tag{4.140} \]

\[ L_o(x) = \tfrac{1}{2}\, x^{T} \Sigma\, x \tag{4.141} \]

We seek a local coordinate transformation under which the nonlinear system
realization (f, g, h) and corresponding energy functions $L_c$ and $L_o$ take an analogous
form in a neighborhood of 0.

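Assume that we already have computed the objects defined in Lemma 4.5.6.
Define the change of coordinates (point-dependent scaling)

\[ \eta : \bar{U} \to \hat{U}, \quad \bar{x} \mapsto \hat{x} = \eta(\bar{x}), \quad \eta(0) = 0 \tag{4.142} \]

by

\[ \hat{x} = \eta(\bar{x}) = \Gamma(\bar{x})\, \bar{x}, \qquad \bar{x} \in \bar{U} \tag{4.143} \]

where $\hat{U} = \eta(\bar{U})$ and

\[ \Gamma(\bar{x}) = \mathrm{diag}\bigl( \mu_1(\bar{x})^{1/4}, \ldots, \mu_n(\bar{x})^{1/4} \bigr) \tag{4.144} \]

Observe that $\eta$ is linear and diagonal for each fixed $\bar{x}$. Composing $\eta^{-1}$ with $\Phi_I$
yields the balancing coordinate transformation

\[ \Phi_B(\hat{x}) = \Phi_I(\eta^{-1}(\hat{x})) = \Phi_M(\nu^{-1}(\eta^{-1}(\hat{x}))), \qquad \hat{x} \in \hat{U} \tag{4.145} \]

where $U = \Phi_B(\hat{U})$. This is summarized in the following.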


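Lemma 4.5.8 (Balanced Form) There exist neighborhoods $U$ and $\hat{U}$ of 0 and a
local change of coordinates

\[ \Phi_B : \hat{U} \to U, \quad \hat{x} \mapsto x = \Phi_B(\hat{x}), \quad \Phi_B(0) = 0 \tag{4.146} \]

such that

\[ \hat{L}_c(\hat{x}) = L_c(\Phi_B(\hat{x})) = \tfrac{1}{2}\, \hat{x}^{T} W^{-1}(\hat{x})\, \hat{x} \tag{4.147} \]

\[ \hat{L}_o(\hat{x}) = L_o(\Phi_B(\hat{x})) = \tfrac{1}{2}\, \hat{x}^{T} W(\hat{x})\, \hat{x} \tag{4.148} \]

where

\[ W(\hat{x}) = \mathrm{diag}(\sigma_1(\hat{x}), \ldots, \sigma_n(\hat{x})) \tag{4.149} \]

\[ \sigma_i(\hat{x}) = \mu_i(\eta^{-1}(\hat{x}))^{1/2}, \qquad i \in \underline{n} \tag{4.150} \]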

Definition 4.5.9 The functions $\{\sigma_1(\cdot), \ldots, \sigma_n(\cdot)\}$ are called the singular value
functions of the affine nonlinear system. □

Remark 4.5.10 The terminology "singular value functions" was coined by Scherpen
in [140, 141], where they were defined, somewhat differently, as

\[ \sigma_i(\hat{x}_i) = \mu_i\bigl( 0, \ldots, 0, \eta^{-1}(\hat{x}_i), 0, \ldots, 0 \bigr)^{1/2} \tag{4.151} \]

The difference is that the square root of $\mu_i$ is evaluated at points on the i-th
coordinate axis. This convention facilitates some of the subsequent calculations
in [140, 141], but is inconsequential for our purposes here. □

Remark 4.5.11 In contrast to the LTI case, the singular value functions are not
invariant under coordinate transformation. However, for a LTI system they
specialize to the constant Hankel singular values. □

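Example 4.5.12 We continue with the LTI system of Example 4.5.7. The
balancing transformation is given by

\[ \hat{x} = \eta(\bar{x}) = \Sigma^{1/2}\, \bar{x} \tag{4.152} \]

\[ x = \Phi_B(\hat{x}) = L\, T\, \Sigma^{-1/2}\, \hat{x} \tag{4.153} \]

resulting in $\hat{L}_c(\hat{x}) = \tfrac{1}{2}\, \hat{x}^{T} \Sigma^{-1} \hat{x}$ and $\hat{L}_o(\hat{x}) = \tfrac{1}{2}\, \hat{x}^{T} \Sigma\, \hat{x}$. □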
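
For the LTI case the whole chain of Examples 4.5.3, 4.5.7, and 4.5.12 can be checked in a few lines. The sketch below is illustrative only: it takes random symmetric positive-definite matrices as stand-ins for actual Gramians (no Lyapunov equations are solved), reads the spectral decomposition as being taken of the observability Gramian expressed in Morse coordinates, $L^{T} W_o L$, and the helper `random_spd` and variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4


def random_spd(k):
    """Random symmetric positive-definite matrix (stand-in for a Gramian)."""
    G = rng.standard_normal((k, k))
    return G @ G.T + k * np.eye(k)


Wc, Wo = random_spd(n), random_spd(n)

L = np.linalg.cholesky(Wc)                 # Wc = L L^T; Morse step x = L x_tilde
sig2, T = np.linalg.eigh(L.T @ Wo @ L)     # Gramian in Morse coordinates = T Sigma^2 T^T
order = np.argsort(sig2)[::-1]             # descending order by convention
sig2, T = sig2[order], T[:, order]
sigma = np.sqrt(sig2)                      # Hankel singular values

Phi_B = L @ T @ np.diag(sigma ** -0.5)     # balancing map x = Phi_B x_hat

ctrl = Phi_B.T @ np.linalg.inv(Wc) @ Phi_B   # quadratic form of 2 L_c in balanced coordinates
obsv = Phi_B.T @ Wo @ Phi_B                  # quadratic form of 2 L_o in balanced coordinates
```

In balanced coordinates `ctrl` should equal the inverse of the diagonal matrix of singular values and `obsv` the diagonal matrix itself, which is the LTI specialization of Lemma 4.5.8.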

The singular value functions $\sigma_1, \ldots, \sigma_n$ and the balancing transformation $\Phi_B$
are not unique for a given realization (f, g, h). This can be argued as a consequence
of the non-uniqueness of the Morse coordinate transformation. Thus, there exists
a family of transformations $\Phi_B$, each producing a balanced realization $(\hat{f}, \hat{g}, \hat{h})$
from among a family of such balanced realizations.

The model reduction properties of nonlinear balancing, as with the LTI case,
reside in a ranking of the singular value functions, i.e., the magnitude of $\sigma_i(\hat{x})$
relative to the others is an indication of the degree to which the i-th state
component contributes to the input-to-output energy gain of the system. Since, in the
nonlinear setting, the $\sigma_i$ are functions of the state, we must be concerned with
the neighborhood of 0 in which the functions do not intersect, i.e., switch places
in the ranking. Furthermore, since they are not unique, there is the question of
whether different collections of $\sigma_i$ for (f, g, h) will result in different orderings by
magnitude. We are not aware of any results addressing these issues.

4.5.4  Computation 

In order to implement the nonlinear balancing procedure, we compute discretized
approximations of the various functions and local coordinate transformations as
described in the previous sections. Figure 4.3 illustrates the computational
procedure. The inputs are the smooth functions f, g, and h in realization (f, g, h)
and a suitable state-space grid $X$. The outputs are discretized approximations
of the functions $\hat{f}$, $\hat{g}$, and $\hat{h}$ in balanced realization $(\hat{f}, \hat{g}, \hat{h})$ and neighborhoods
of grid points $U$ and $\hat{U}$ representing the neighborhoods on which the balancing
transformation is defined. In this section we present the computational framework
and the main algorithms for performing the required computations.

Figure 4.3: Overview of computational procedure for nonlinear balancing.

Computational Setting


In the computational setting, functions are evaluated at a pre-determined set of
points on a state-space grid, i.e., they are discretized approximations. A
neighborhood of 0 corresponds to a set of discrete grid points containing the point
representing the origin. We do not address the problem of determining an
appropriate discretization of the state-space. Rather, we assume that a grid, i.e., a
collection of points denoted $X$, of sufficient resolution has been constructed.
Suppose that we have discretized the state-space in such a way that there are p evenly
spaced grid points along each of the n dimensions. This means that there are $p^n$
total discrete points in the state-space grid.

It  is  normally  the  case  that  the  basic  computational  primitives  of  an  algorithm 
are  elementary  operations  such  as  floating  point  additions,  multiplications,  and  so 
forth.  Such  a  low-level  viewpoint  is  unsuitable  for  our  purposes  here.  Instead,  we 
consider  the  primitives  to  be 

•  standard matrix-vector operations such as matrix multiplication, matrix
inversion, matrix transposition, and spectral decomposition; and

•  standard operations on a function of a real variable such as definite
integration, partial differentiation, and multi-dimensional interpolation.

There exist standard algorithms for performing the above operations, each of
which has an associated computational complexity (e.g., $O(m^3)$ elementary
operations for multiplication of m x m matrices). However, for the balancing
algorithms, the overall computational complexity is dominated by the system
dimension as manifested in the grid resolution, i.e., the number of grid points at which
computations are performed. For example, we use an algorithm called
Spectral-Decomp that takes a matrix-valued function, and returns the eigenvalues and
eigenvectors (performs a spectral decomposition) at each point in the state-space
grid. Thus, if the spectral decomposition algorithm has complexity $O(s(m))$, the
algorithm Spectral-Decomp has complexity $O(s(m)\, p^n)$, i.e., is exponential in
n. Point dependency dominates the computational complexity of the nonlinear
balancing algorithms.

We  adopt  a  point-wise  data  structure  for  storage  of  the  objects  of  interest  (e.g., 
grid  points,  controllability  function).  Although  we  do  not  program  the  balancing 
procedure  using  a  database  per  se,  it  is  a  useful  model  for  illustration.  Consider 
a  database,  i.e.,  a  collection  of  data  records,  each  of  which  corresponds  to  a  single 
point  in  the  state-space  grid.  Each  data  record  contains  the  value  of  each  of  the 
objects  of  interest  at  one  particular  grid  point.  Thus,  a  function  is  represented  by 
one field of the entire database. The database structure is given in Table 4.1.

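It is useful to define the inverses of the coordinate transformations $\Phi_M$, $\Phi_I$,
and $\Phi_B$ by

\[ \Psi_M = \Phi_M^{-1}, \qquad \Psi_I = \Phi_I^{-1}, \qquad \Psi_B = \Phi_B^{-1} \tag{4.154} \]

on $\Phi_M(\tilde{U})$, $\Phi_I(\bar{U})$, and $\Phi_B(\hat{U})$, respectively. The inverse transformations are
represented in the data structure, respectively, by the $\tilde{x}$, $\bar{x}$, and $\hat{x}$ data elements
corresponding to grid point x.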

Balanced  Realization 

The  algorithm  for  computing  a  balanced  realization  is  as  follows. 

Algorithm 4.5.13 (Balanced Realization)

Balance(f, g, h, X)

1  Y ← Sde-Sim(f, g, X)

2  L_c ← Ctrb-Fn-Monte-Carlo(Y, X)

3  L_o ← Obsv-Fn(f, h, X)

4  ($\phi$, A, U, $\tilde{U}$) ← Morse-Palais(L_c, X)

5  $\Phi_M$ ← Spherical($\phi$, A, U)

6  $\tilde{L}_o$ ← Transform(L_o, $\Phi_M$, $\tilde{U}$)

7  M ← Morse-Fn-Decomp($\tilde{L}_o$, $\tilde{U}$)

8  (T, $\Lambda$) ← Spectral-Decomp(M, $\tilde{U}$)

9  ($\Phi_I$, $\bar{U}$) ← Input-Normal-Trans(T, $\Phi_M$, $\tilde{U}$)

10  ($\Psi_B$, $\hat{U}$) ← Balancing-Trans($\Lambda$, $\Phi_I$, $\bar{U}$)

11  ($\hat{f}$, $\hat{g}$, $\hat{h}$) ← Transform(f, g, h, $\Psi_B$, $\hat{U}$)

12  Return $\hat{f}$, $\hat{g}$, $\hat{h}$, $\hat{U}$  □

Data Record Structure

x | $L_c$ | $L_o$ | $\tilde{x}$ | $U$ | $\tilde{U}$ | $M$ | $T$ | $\lambda_1 \ldots \lambda_n$ | $\bar{x}$ | $\bar{U}$ | $\Gamma$ | $\hat{x}$ | $\hat{U}$ | $\sigma_1 \ldots \sigma_n$

Data Fields

Data Element    Description                                 Type
------------    ----------------------------------------    ------------
x               Grid Point in Standard Coordinates          n-vector
$L_c$           Value of Controllability Function           scalar
$L_o$           Value of Observability Function             scalar
$\tilde{x}$     Grid Point in Morse Coordinates for $L_c$   n-vector
$U$             Neighborhood Membership Indicator           boolean
$\tilde{U}$     Neighborhood Membership Indicator           boolean
$M$             Intermediate Matrix for $L_o$               n x n matrix
$T$             Eigenvector Matrix for $M$                  n x n matrix
$\lambda_1$     Largest Eigenvalue for $M$                  scalar
$\lambda_n$     Smallest Eigenvalue for $M$                 scalar
$\bar{x}$       Grid Point in Input-Normal Coordinates      n-vector
$\bar{U}$       Neighborhood Membership Indicator           boolean
$\Gamma$        Scaling Matrix                              n x n matrix
$\hat{x}$       Grid Point in Balanced Coordinates          n-vector
$\hat{U}$       Neighborhood Membership Indicator           boolean
$\sigma_1$      Largest Singular Value                      scalar
$\sigma_n$      Smallest Singular Value                     scalar

Table 4.1: Database structure containing data elements for nonlinear balancing
computational procedure. For a state-space grid with $p^n$ points, there are $p^n$ data
records, each corresponding to a grid point.

We now present the computational methods and algorithms corresponding to
the individual steps of Algorithm 4.5.13. Some of these have been addressed
previously, although not within the framework of our computational setting.

Controllability  Function 

Methods for computing the controllability function have been described in
Sections 4.2 and 4.3. In cases where we can derive an exact expression for $L_c$, e.g., via
Theorem 4.3.14, the task is completed by discretizing the resulting function $L_c$.
Otherwise, we use the Monte-Carlo approach, yielding the approximation (4.39),
which improves as the number of experiments increases. We have developed
versions of Sde-Sim and Ctrb-Fn-Monte-Carlo, respectively, for numerical
integration of SDEs and computation of the controllability function from the
Monte-Carlo data. Numerical schemes for simulation of SDEs appear in Appendix C. We
developed and used various multidimensional histogramming utilities to implement
the Monte-Carlo approach.

Observability Function


We have not addressed computation of the observability function in detail. We
implement Obsv-Fn to compute the observability function via numerical
integration of the natural response of the system and numerical integration of the output
energy. The procedure is performed for each point on the state-space grid.

Algorithm  4.5.14  (Observability  Function) 

Obsv-Fn($f$, $h$, $X$)

1  for each point $x$ in the state-space grid $X$
2      $x_0 \leftarrow x$
3      $u \leftarrow 0$
4      $z \leftarrow \text{Ode-Sim}(f, u, x_0)$
5      $y \leftarrow \text{Compose}(h, z)$
6      $T \leftarrow$ a number large enough so that the natural response of the stable
       system is nearly zero for $t > T$
7      $L_o[x] \leftarrow \text{Integral}(y, 0, T)$
8  Return $L_o$  □
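
As an illustration only, the procedure can be sketched in Python for a generic unforced system. The sketch assumes a fixed-step fourth-order Runge-Kutta scheme in place of Ode-Sim, the trapezoidal rule in place of Integral, and the convention that the output energy is $\frac{1}{2}\int_0^T y^2\,dt$ (consistent with the factor $\frac{1}{2} h^T h$ in the Lyapunov-type PDE (4.15)). The names `rk4_step` and `obsv_fn`, the step size, and the scalar test system are hypothetical and are not taken from the toolbox.

```python
def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for the unforced system x' = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + dt * ki for xi, ki in zip(x, k3)])
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def obsv_fn(f, h, grid, T, dt=0.01):
    """Approximate L_o at each grid point from the natural response (u = 0)."""
    L_o = {}
    steps = int(round(T / dt))
    for x0 in grid:
        x = list(x0)
        energy = 0.0
        y_prev = h(x)
        for _ in range(steps):
            x = rk4_step(f, x, dt)
            y = h(x)
            # trapezoidal rule applied to (1/2) * y^2
            energy += 0.5 * dt * 0.5 * (y_prev ** 2 + y ** 2)
            y_prev = y
        L_o[tuple(x0)] = energy
    return L_o


# Scalar test system x' = -a x, y = x, for which L_o(x) = x^2 / (4 a).
a = 0.5
grid = [(-1.0,), (0.0,), (0.5,), (1.0,)]
L_o = obsv_fn(lambda x: [-a * x[0]], lambda x: x[0], grid, T=40.0)
```

For the scalar test system the closed form $x^2/(4a)$ gives a direct check of the discretization; the horizon `T` plays the role of the number chosen in step 6.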


Morse  Coordinate  Transformation 

The algorithms for Morse-Palais and Matrix-Square-Root are based on the
computational procedures presented in Section 4.4.3.

Algorithm  4.5.15  (Morse  Coordinate  Transformation) 

Morse-Palais($f$, $X$)

1  $H \leftarrow \text{Morse-Fn-Decomp}(f, X)$
2  $A \leftarrow \text{Origin-Select}(H, X)$
3  for each point $x$ in the state-space grid $X$
4      $B \leftarrow A \backslash H[x]$
5      if $\text{Matrix-Norm}(I - B) < 1$
6          $C \leftarrow \text{Matrix-Square-Root}(B)$
7          $z \leftarrow C * x$
8          $\phi[x] \leftarrow z$
9          add $x$ to the list of points in neighborhood $U$
10         add $z$ to the list of points in neighborhood $\tilde{U}$
11  Return $\phi, A, U, \tilde{U}$  □

Remark 4.5.16 The expression $B \leftarrow A \backslash H[x]$ is equivalent to solving
$H(x) = A B$ for $B$. □


The algorithm for Matrix-Square-Root is as follows.

Algorithm 4.5.17 (Matrix Square Root)

Matrix-Square-Root($B$)

1  if $\text{Matrix-Norm}(I - B) < 1$
2      $C \leftarrow C_{prev} \leftarrow I$
3      $\delta \leftarrow \epsilon + 1$
4      while $\delta > \epsilon$
5          $C \leftarrow C + 0.5 * (B - C^2)$
6          $\delta \leftarrow \text{Matrix-Norm}(C - C_{prev})$
7          $C_{prev} \leftarrow C$
8  else
9      error: algorithm will not converge
10  Return $C$  □

Remark 4.5.18 The parameter $\epsilon$ represents an error tolerance. It would be set
as a global constant or in some other appropriate manner. □
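
The iteration in Algorithm 4.5.17 can be tried out with a few lines of Python. The sketch below is an assumption-laden illustration rather than the toolbox code: it uses the Frobenius norm as a conservative stand-in for Matrix-Norm (it bounds the largest singular value from above), adds a hypothetical `max_iter` safeguard in place of the bare while loop, and all helper names are invented for the example.

```python
import math


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def frobenius_norm(A):
    # Upper bound on the largest singular value of A
    return math.sqrt(sum(a * a for row in A for a in row))


def matrix_square_root(B, eps=1e-12, max_iter=10000):
    """Iterate C <- C + 0.5 * (B - C^2) starting from C = I."""
    n = len(B)
    if frobenius_norm(mat_sub(identity(n), B)) >= 1.0:
        raise ValueError("iteration is not guaranteed to converge")
    C = identity(n)
    for _ in range(max_iter):
        correction = mat_sub(B, mat_mul(C, C))
        C_new = [[c + 0.5 * r for c, r in zip(rc, rr)]
                 for rc, rr in zip(C, correction)]
        delta = frobenius_norm(mat_sub(C_new, C))
        C = C_new
        if delta <= eps:
            return C
    raise RuntimeError("tolerance not reached")


B = [[1.2, 0.1], [0.1, 0.9]]
C = matrix_square_root(B)
residual = frobenius_norm(mat_sub(mat_mul(C, C), B))
```

Because each update equals half of $B - C^2$, the stopping quantity $\delta$ also bounds the remaining error in $C^2 \approx B$, which is why a single tolerance suffices.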


Algorithm 4.5.15 returns discretized versions of the objects that appear in
Theorem 4.4.7, i.e., the coordinate transformation $\phi$, the invertible symmetric matrix
$A$, and the neighborhoods $U$ and $\tilde{U}$ of 0. Computation of $\psi_M$ from $\phi$ and $A$
requires several additional straightforward steps including a standard spectral
decomposition of $A$.

Algorithm  4.5.19  (Spherical  Quadratic  Form) 

Spherical($\phi$, $A$, $U$)

1  $(V, \Lambda) \leftarrow \text{Eig}(A)$
2  $E \leftarrow \text{Abs}(\Lambda)$
3  $R \leftarrow E * \text{Transpose}(V)$
4  $r \leftarrow$ the number of negative entries on the diagonal of $\Lambda$
5  for each point $x$ in the collection of grid points $U$
6      $\psi[x] \leftarrow R * \phi[x]$
7  Return $\psi, r$  □

Remark 4.5.20 When dealing with positive functions such as $L_c$ and $L_o$, the index
$r$ is always zero so we ignore that parameter in the balancing procedure. □


Function  Decompositions 

Algorithms 4.5.13 and 4.5.15 use the following algorithms for approximating the
decompositions (4.89) and (4.93). Also, the decomposition $L_o = x^T M(x)\, x$
appears in the overall balancing procedure.

Algorithm  4.5.21  (Smooth  Function  Decomposition) 

Smooth-Fn-Decomp($f$, $U$)

1  for $i = 1$ to $n$
2      $partialf[i] \leftarrow \text{Partial-Deriv}(f, i, U)$
3  for each point $x$ in the collection of grid points $U$
4      for $i = 1$ to $n$
5          $g[i][x] \leftarrow \text{Integral}(partialf[i], 0, x)$
6  $G \leftarrow \text{Vector}(g[1], \ldots, g[n])$
7  Return $G$  □
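
The following Python sketch shows one way the decomposition could be evaluated at a single point. It assumes that (4.89) is the standard decomposition $f(x) = f(0) + \sum_i x_i\, g_i(x)$ with $g_i(x) = \int_0^1 \frac{\partial f}{\partial x_i}(t x)\, dt$, which is not restated in this section. It also works with a function given in closed form instead of grid data, uses central differences for Partial-Deriv and the composite Simpson rule for Integral, and tests with the exact pendulum controllability function (4.172). The function names and step sizes are hypothetical.

```python
import math


def partial_deriv(f, i, x, step=1e-5):
    """Central-difference approximation of the i-th partial derivative."""
    xp = list(x)
    xm = list(x)
    xp[i] += step
    xm[i] -= step
    return (f(xp) - f(xm)) / (2.0 * step)


def smooth_fn_decomp(f, x, panels=50):
    """Return [g_1(x), ..., g_n(x)] with g_i(x) = int_0^1 d_i f(t x) dt."""
    m = 2 * panels                      # number of Simpson subintervals
    g = []
    for i in range(len(x)):
        total = 0.0
        for j in range(m + 1):
            t = j / m
            w = 1.0 if j in (0, m) else (4.0 if j % 2 else 2.0)
            total += w * partial_deriv(f, i, [t * xk for xk in x])
        g.append(total / (3.0 * m))
    return g


def f(x):
    # Exact pendulum controllability function (4.172)
    return -20.0 * math.cos(x[0]) + 2.0 * x[0] ** 2 + 20.0 * x[1] ** 2 + 20.0


x = [0.7, -0.4]
g = smooth_fn_decomp(f, x)
reconstructed = f([0.0, 0.0]) + sum(xi * gi for xi, gi in zip(x, g))
```

Applying the same routine to each $g_i$ is the idea behind Morse-Fn-Decomp, which yields the matrix-valued function $H$ with $f(x) = x^T H(x)\, x$ when $f$ and its gradient vanish at 0.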

Algorithm  4.5.22  (Morse  Function  Decomposition) 

Morse-Fn-Decomp($f$, $U$)

1  $G \leftarrow \text{Smooth-Fn-Decomp}(f, U)$
2  for $i = 1$ to $n$
3      $F[i] \leftarrow \text{Smooth-Fn-Decomp}(G[i], U)$
4      for $j = 1$ to $n$
5          $h[i, j] \leftarrow F[i][j]$
6  $H \leftarrow \text{Matrix}(h[1, 1], \ldots, h[n, n])$
7  Return $H$  □


Input-Normal  and  Balancing  Coordinate  Transformations 

The  algorithms  for  computing  discretized  approximations  of  the  input-normal  and 
balancing  transformations  are  based  on  the  computational  procedures  described  in 
Sections  4.5.2  and  4.5.3. 
Algorithm 4.5.23 (Input-Normal Coordinate Transformation)

Input-Normal-Trans($T$, $\xi$, $U$)

1  for each point $x$ in the collection of grid points $U$
2      $y \leftarrow \xi[x]$
3      $z \leftarrow \text{Transpose}(T[y]) * y$
4      $\psi_N[x] \leftarrow z$
5      add $z$ to the list of points in neighborhood $\hat{U}$
6  Return $\psi_N, \hat{U}$  □

Algorithm 4.5.24 (Balancing Coordinate Transformation)

Balancing-Trans($\Lambda$, $\xi$, $U$)

1  for each point $x$ in the collection of grid points $U$
2      $y \leftarrow \xi[x]$
3      $\Sigma \leftarrow \text{Square-Root}(\Lambda, U)$
4      $\Gamma \leftarrow \text{Square-Root}(\Sigma, U)$
5      $z \leftarrow \Gamma[y] * y$
6      $\psi[x] \leftarrow z$
7      add $z$ to the list of points in neighborhood $\bar{U}$
8  Return $\psi, \bar{U}$  □


MATLAB  Toolbox 

We have implemented the algorithms described in this section using the
MATLAB [102] programming environment. The resulting collection of programs and
utilities for performing various operations on multidimensional discretized
functions, referred to as the nonlinear balancing toolbox, was used as the computational
tool to apply nonlinear balancing to the examples in Section 4.6.

We performed simulations on a Sun Sparc Ultra-10 running the UNIX operating
system. Running times for the various programs depend on grid resolution and
system dimension. Roughly, the time required to compute a Morse transformation
for systems of dimension 2, 3, and 4 is on the order of, respectively, seconds
to minutes, minutes to hours, and hours to days. Computations for systems of
dimension 5 and higher are currently infeasible.

It  is  possible  to  increase  the  speed  of  computation  by  converting  the  MATLAB 
code  to  C,  using  a  faster  processor,  and  taking  advantage  of  opportunities  for 
parallelization  and  other  economies.  However,  we  did  not  pursue  these  options, 
since  it  is  unlikely  that  the  feasible  dimension  would  increase  significantly.  Rather, 
we  believe  that  new  algorithms  will  be  required  for  working  with  higher  dimensional 
systems. 

Utilities 

The algorithms presented in this section use various utilities for performing
computations with multidimensional discretized functions, standard vector-matrix
operations, operations on a function of a real variable, and so forth. We briefly
describe their purposes here so that the previous algorithms can be understood.
The actual implementations in the nonlinear balancing toolbox do not necessarily
reflect exactly the descriptions given below, nor is this list a complete compilation
of toolbox utilities (e.g., Transform and Integral require the use of additional
multidimensional interpolation utilities). Versions of some of these utilities are
included as standard functionality in MATLAB. By a point-dependent matrix we
mean a matrix-valued function on a grid.

Spectral-Decomp returns the point-dependent eigenvectors and eigenvalues of
a point-dependent matrix.

Square-Root returns a point-dependent matrix whose entries are the square
roots of the respective entries of a point-dependent matrix.

Transform takes one or more discretized mappings and returns its (their) values
at the grid points after a coordinate transformation.

Integral returns the approximate definite integral of a discretized function.

Ode-Sim returns the sampled time evolution of a forced ODE given initial
conditions and sampled input signal.

Partial-Deriv returns an approximation to the $i$-th partial derivative of a
discretized function.

Compose returns a discretized function that represents the composition of two
other discretized functions.

Origin-Select returns the indices of the grid point that represents the origin in
state-space.

Eig returns the eigenvalues and unit eigenvectors of an invertible matrix.

Matrix-Norm returns the largest singular value of a matrix.

Abs returns a matrix whose entries are the absolute values of the entries of the
original matrix.

Transpose returns the transpose of a matrix.

Vector assembles a collection of numbers or discretized functions into a vector.

Matrix assembles a collection of numbers or discretized functions into a matrix.


4.6  Applications 

In this section we illustrate the methods and algorithms presented in this chapter
by applying them to two examples of rigid link mechanical systems. We compute a
balanced realization for a forced damped pendulum system, and take steps toward
balancing a forced damped double pendulum system. The material in this section
relies heavily on the mathematical framework for mechanical systems as presented
in Section 2.7.

4.6.1  A Balanced Realization for the Forced Damped Pendulum

The  first  example  that  we  consider  is  a  simple  pendulum  system  as  illustrated 
in  Figure  4.4.  The  system  incorporates  linear  torsional  damping,  linear  torsional 
stiffness,  and  a  torque  input  at  the  rotary  joint.  We  assume  that  the  shaft  is 
massless  and  that  the  pendulum  moves  only  in  the  plane.  We  consider  two  cases 
for  the  system  output:  where  the  joint  angle  is  measured  (position  read-out)  and 
where  the  joint  angular  velocity  is  measured  (velocity  read-out).  Figure  4.4  also 
provides  values  for  each  of  the  physical  parameters  that  we  use  in  numerical  studies. 
Simulations  were  performed  using  routines  from  the  nonlinear  balancing  toolbox 
described  in  Section  4.5.4. 

It  is  beneficial  to  study  the  pendulum  as  an  example  because 

•  it  is  nearly  linear,  so  we  can  use  the  LTI  theory  to  obtain  a  good  estimate  of 
the  correct  results  for  comparison;  and 

•  in  previous  sections  we  have  studied  second-order  mechanical  systems  and 
obtained  an  exact  formula  for  the  controllability  function. 

State-Space  Realization 

Symbol     Value   Description
$\theta$           joint angle (between shaft and vertical)
$\tau$             torque applied at rotary joint
$m$        1/40    mass attached to end of shaft
$b$        2       torsional damping coefficient (friction)
$k$        1       torsional stiffness coefficient (spring constant)
$L$        20      length of shaft
$G$        10      gravitational acceleration

Figure 4.4: Planar pendulum system with massless shaft, linear torsional damping,
linear torsional stiffness, and torque input applied at the rotary joint. Values of
parameters are provided for the numerical studies that we conducted.

We obtain a state-space realization $(f, g, h)$ for the pendulum system via the
Euler-Lagrangian mechanics outlined in Section 2.7. Let the generalized position $q$ and
velocity $\dot{q}$ correspond to the joint angle $\theta$ and angular velocity $\dot{\theta}$, respectively. Let
the generalized force $F$ represent the applied joint torque $\tau$. The kinetic, potential,
and dissipation energies are given, respectively, by


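$$K(q, \dot{q}) = \frac{1}{2} m L^2 \dot{q}^2 \tag{4.155}$$

$$U(q, \dot{q}) = \frac{1}{2} k q^2 - m G L \cos(q) \tag{4.156}$$

$$R(q, \dot{q}) = \frac{1}{2} b \dot{q}^2 \tag{4.157}$$

The Lagrangian $L$ is given by

$$L(q, \dot{q}) = K(q, \dot{q}) - U(q, \dot{q}) = \frac{1}{2} m L^2 \dot{q}^2 - \frac{1}{2} k q^2 + m G L \cos(q) \tag{4.158}$$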
We apply the Euler-Lagrange equation of motion (2.113), i.e.,

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = F - \frac{\partial R}{\partial \dot{q}} \tag{4.159}$$

to obtain the equation of motion for the pendulum system, given by

$$m L^2 \ddot{q} + k q + m G L \sin(q) = F - b \dot{q} \tag{4.160}$$


The affine nonlinear control system is realized in coordinates $x = (x_1, x_2) = (q, \dot{q})$ by

$$f(x) = \begin{bmatrix} x_2 \\ -\dfrac{G}{L} \sin(x_1) - \dfrac{k}{m L^2}\, x_1 - \dfrac{b}{m L^2}\, x_2 \end{bmatrix}, \qquad g(x) = \begin{bmatrix} 0 \\ \dfrac{1}{m L^2} \end{bmatrix} \tag{4.161}$$

and either $h(x) = x_1$ or $h(x) = x_2$ depending on whether we measure angular
position or velocity.

Remark 4.6.1 We need not explicitly realize the system in Hamiltonian
coordinates (generalized positions and momenta). The results in Section 4.3.2 still apply,
taking into account the different coordinates. □


System  Properties 

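The linearization $(A, B, C)$ of $(f, g, h)$ is given by

$$A = \begin{bmatrix} 0 & 1 \\ -\dfrac{G}{L} - \dfrac{k}{m L^2} & -\dfrac{b}{m L^2} \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ \dfrac{1}{m L^2} \end{bmatrix} \tag{4.162}$$

and either $C = [1 \;\; 0]$ for position read-out or $C = [0 \;\; 1]$ for velocity read-out.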
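
As a quick numerical cross-check of (4.161) and (4.162), the Python sketch below codes the vector fields with the parameter values of Figure 4.4 and differentiates $f$ at the origin by central differences. The helper `jacobian` and its step size are illustrative choices, not part of the nonlinear balancing toolbox.

```python
import math

# Parameter values from Figure 4.4
m, b, k, L, G = 1.0 / 40.0, 2.0, 1.0, 20.0, 10.0


def f(x):
    """Drift vector field of the pendulum, cf. (4.161)."""
    x1, x2 = x
    return [x2,
            -(G / L) * math.sin(x1) - k / (m * L ** 2) * x1
            - b / (m * L ** 2) * x2]


def g(x):
    """Input vector field; constant for this system."""
    return [0.0, 1.0 / (m * L ** 2)]


def jacobian(func, x, step=1e-6):
    """Central-difference Jacobian of func at x."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += step
        xm[j] -= step
        fp, fm = func(xp), func(xm)
        cols.append([(p - q) / (2.0 * step) for p, q in zip(fp, fm)])
    # cols[j][i] holds d f_i / d x_j; transpose to row-major layout
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]


A = jacobian(f, [0.0, 0.0])
B = g([0.0, 0.0])
```

With these values the second row of `A` is approximately $(-0.6, -0.2)$ and `B` is $(0, 0.1)$, in agreement with the symbolic entries $-G/L - k/(mL^2)$, $-b/(mL^2)$, and $1/(mL^2)$.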

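Observe that the linearization is asymptotically stable since

$$\operatorname{spec}(A) = \left\{ \frac{1}{2 m L^2} \left( -b \pm \sqrt{b^2 - 4 m^2 L^3 G - 4 m L^2 k} \right) \right\} \subset \mathbb{C}^{-} \tag{4.163}$$

Furthermore, the linearization is controllable and observable in both output cases
since

$$\operatorname{rank} \begin{bmatrix} B & AB \end{bmatrix} = \operatorname{rank} \begin{bmatrix} 0 & \dfrac{1}{m L^2} \\ \dfrac{1}{m L^2} & -\dfrac{b}{m^2 L^4} \end{bmatrix} = 2 \tag{4.164}$$

and either

$$\operatorname{rank} \begin{bmatrix} C \\ CA \end{bmatrix} = \operatorname{rank} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = 2 \tag{4.165}$$

for position read-out, or

$$\operatorname{rank} \begin{bmatrix} C \\ CA \end{bmatrix} = \operatorname{rank} \begin{bmatrix} 0 & 1 \\ -\dfrac{G}{L} - \dfrac{k}{m L^2} & -\dfrac{b}{m L^2} \end{bmatrix} = 2 \tag{4.166}$$

for velocity read-out.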

Controllability  and  observability  of  the  linearization,  together  with  asymptotic 
stability,  are  sufficient  to  guarantee  local  asymptotic  reachability  and  zero-state 
observability  of  the  nonlinear  system  (see,  e.g.,  [121]).  Therefore,  we  can  conclude 
that  the  controllability  and  observability  functions  exist  in  a  neighborhood  of  0 
and  are  non-degenerate. 
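
The stability and rank conditions above are easy to confirm numerically. The short Python sketch below evaluates (4.163) and the determinants of the $2 \times 2$ matrices in (4.164) through (4.166) for the parameter values of Figure 4.4; the variable names are hypothetical.

```python
import cmath

# Parameter values from Figure 4.4
m, b, k, L, G = 1.0 / 40.0, 2.0, 1.0, 20.0, 10.0

# Eigenvalues of A from (4.163)
disc = b ** 2 - 4.0 * m ** 2 * L ** 3 * G - 4.0 * m * L ** 2 * k
eigs = [(-b + s * cmath.sqrt(disc)) / (2.0 * m * L ** 2) for s in (1, -1)]

# Entries of the linearization (4.162)
a21 = -G / L - k / (m * L ** 2)
a22 = -b / (m * L ** 2)
b2 = 1.0 / (m * L ** 2)

# A nonzero determinant means the 2 x 2 matrix has rank 2
det_ctrb = 0.0 * (a22 * b2) - b2 * b2        # det [B  AB]
det_obsv_position = 1.0                       # det [[1, 0], [0, 1]]
det_obsv_velocity = 0.0 * a22 - 1.0 * a21     # det [[0, 1], [a21, a22]]

stable = all(ev.real < 0 for ev in eigs)
time_constant = max(-1.0 / ev.real for ev in eigs)
```

For these values the eigenvalues are a complex pair with real part $-0.1$, so the slowest time constant is 10 time units.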

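Substituting the values of the parameters given in Figure 4.4 gives the linearized
realization

$$A = \begin{bmatrix} 0 & 1 \\ -0.6 & -0.2 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 0.1 \end{bmatrix} \tag{4.167}$$

and either $C = [1 \;\; 0]$ or $C = [0 \;\; 1]$. The Gramians and Hankel singular values
are given, for the position and velocity output cases, respectively, by

$$W_c = \begin{bmatrix} 0.0417 & 0.0000 \\ 0.0000 & 0.0250 \end{bmatrix}, \qquad W_o = \begin{bmatrix} 2.6667 & 0.8333 \\ 0.8333 & 4.1667 \end{bmatrix} \tag{4.168}$$

$\sigma_1 = 0.3671$ and $\sigma_2 = 0.2838$; and

$$W_c = \begin{bmatrix} 0.0417 & 0.0000 \\ 0.0000 & 0.0250 \end{bmatrix}, \qquad W_o = \begin{bmatrix} 1.5000 & 0.0000 \\ 0.0000 & 2.5000 \end{bmatrix} \tag{4.169}$$

$\sigma_1 = 0.2500$ and $\sigma_2 = 0.2500$.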
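
The Gramians and Hankel singular values of the linearization can be recomputed directly from the Lyapunov equations $A W_c + W_c A^T + B B^T = 0$ and $A^T W_o + W_o A + C^T C = 0$. The Python sketch below does this for the $2 \times 2$ case by solving for the three independent entries of each symmetric Gramian; `solve3`, `lyap2`, and `hankel_singular_values` are hypothetical helper names written for this illustration.

```python
import math


def solve3(M, rhs):
    """Gaussian elimination with partial pivoting for a 3 x 3 system."""
    a = [row[:] + [r] for row, r in zip(M, rhs)]
    n = 3
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            fac = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= fac * a[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def lyap2(M, Q):
    """Solve M W + W M^T + Q = 0 for a symmetric 2 x 2 matrix W."""
    (m11, m12), (m21, m22) = M
    rows = [[2 * m11, 2 * m12, 0.0],
            [m21, m11 + m22, m12],
            [0.0, 2 * m21, 2 * m22]]
    w11, w12, w22 = solve3(rows, [-Q[0][0], -Q[0][1], -Q[1][1]])
    return [[w11, w12], [w12, w22]]


def hankel_singular_values(Wc, Wo):
    """Square roots of the eigenvalues of Wc * Wo (2 x 2 case)."""
    P = [[sum(Wc[i][k] * Wo[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    tr = P[0][0] + P[1][1]
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    root = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return [math.sqrt(tr / 2.0 + root), math.sqrt(tr / 2.0 - root)]


A = [[0.0, 1.0], [-0.6, -0.2]]
At = [[0.0, -0.6], [1.0, -0.2]]
B = [0.0, 0.1]
Wc = lyap2(A, [[B[i] * B[j] for j in range(2)] for i in range(2)])
results = {}
for name, C in (("position", [1.0, 0.0]), ("velocity", [0.0, 1.0])):
    Wo = lyap2(At, [[C[i] * C[j] for j in range(2)] for i in range(2)])
    results[name] = (Wo, hankel_singular_values(Wc, Wo))
```

The dictionary `results` holds the observability Gramian and the Hankel singular values for each read-out, which can be compared with the values quoted above.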


Controllability  Function 

The Hamiltonian $H$ for the pendulum system is given by

$$H(x) = K(x) + U(x) = \frac{1}{2} m L^2 x_2^2 + \frac{1}{2} k x_1^2 - m G L \cos(x_1) \tag{4.170}$$

Furthermore, the equipartition of energy condition (4.65) is satisfied trivially for
the 1-DOF system with ratio equal to $b$. Applying Theorem 4.3.14, the controllability
function $L_c$ is given, exactly, by

$$L_c(x) = -2 b m G L \cos(x_1) + b k x_1^2 + b m L^2 x_2^2 + 2 b m G L \tag{4.171}$$

and with substituted values by

$$L_c(x) = -20 \cos(x_1) + 2 x_1^2 + 20 x_2^2 + 20 \tag{4.172}$$

Since we have an exact formula for $L_c$, we can study the performance of the
Monte-Carlo approach by comparing $L_c$ with an approximation computed via
Equation (4.39) using an approximate stationary density. To this end, we
simulated 50,000 sample paths for the pendulum system with approximate Gaussian
white noise injected as the torque input. We assumed that steady-state was reached
after 60 time units, 6 times the largest time constant of the system.

The results of the Monte-Carlo experiments are presented in Figure 4.5. We
generated histograms for two grid resolutions: $\Delta x = 0.1$ (coarse) and $\Delta x = 0.05$
(fine). The accuracy of computations such as approximate differentiation and
interpolation improves as grid resolution increases, i.e., becomes finer. However, it
is crucial to obtain an approximation of $L_c$ that is reasonably smooth and
consequently has no local minima or maxima other than at 0. Smoothness of the
approximation improves as the grid resolution becomes coarser. In this case, we
use the coarse grid, which is roughly the highest resolution that provides a smooth
approximation. By generating additional sample paths, we can increase the grid
resolution while maintaining a smooth approximation.

We investigate the performance of the Monte-Carlo approach by comparing
the approximate $L_c$ with the exact $L_c$ given by (4.171). Moreover, we check to
see if $L_c$ and the approximation satisfy the HJB equation (4.14). This is done by
computing and plotting the HJB residual, i.e., the right hand side of (4.14), given
by

$$p_c(x) = \frac{\partial L_c}{\partial x}(x)\, f(x) + \frac{1}{2} \frac{\partial L_c}{\partial x}(x)\, g(x)\, g^T(x)\, \frac{\partial^T L_c}{\partial x}(x) \tag{4.173}$$

The  results  are  shown,  for  low  and  high  resolution  grids,  respectively,  in  Figures  4.6 
and  4.7.  The  large  fluctuations  in  the  residuals  at  the  edges  of  the  grid  are  due 
to  numerical  errors  in  computing  derivatives  at  the  edges  and  should  be  ignored. 
The  residual  is  exactly  zero  everywhere  on  the  grid  for  the  exact  controllability 
function,  thus  confirming  that  it  exactly  satisfies  the  HJB  equation.  The  residual 
for  the  approximate  controllability  function  fluctuates  somewhat  (more  on  the  finer 
grid)  but  remains  relatively  close  to  zero  at  all  grid  points.  The  approximation  is 
better  at  points  close  to  the  origin,  since  a  region  close  to  the  origin  contains  most 
of  the  Monte-Carlo  data.  The  performance  of  the  Monte-Carlo  approach  and  its 
numerical  implementation  in  the  nonlinear  balancing  toolbox  appear  to  be  good 
for  this  example. 
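
That the exact controllability function (4.171) annihilates the HJB residual (4.173) can also be confirmed pointwise. The Python sketch below uses the analytic gradient of $L_c$ and the parameter values of Figure 4.4; the grid extent $[-1, 1]^2$ is an assumption made for the illustration and the function names are hypothetical.

```python
import math

# Parameter values from Figure 4.4
m, b, k, L, G = 1.0 / 40.0, 2.0, 1.0, 20.0, 10.0


def f(x):
    x1, x2 = x
    return [x2,
            -(G / L) * math.sin(x1) - k / (m * L ** 2) * x1
            - b / (m * L ** 2) * x2]


def g(x):
    return [0.0, 1.0 / (m * L ** 2)]


def grad_Lc(x):
    """Analytic gradient of the exact controllability function (4.171)."""
    x1, x2 = x
    return [2.0 * b * m * G * L * math.sin(x1) + 2.0 * b * k * x1,
            2.0 * b * m * L ** 2 * x2]


def hjb_residual(x):
    """Residual (4.173) for a scalar input."""
    dL = grad_Lc(x)
    drift = sum(d * fi for d, fi in zip(dL, f(x)))
    dLg = sum(d * gi for d, gi in zip(dL, g(x)))
    return drift + 0.5 * dLg * dLg


# Assumed grid: 21 x 21 points on [-1, 1] x [-1, 1] with spacing 0.1
grid = [(-1.0 + 0.1 * i, -1.0 + 0.1 * j) for i in range(21) for j in range(21)]
worst = max(abs(hjb_residual(x)) for x in grid)
```

Analytically the drift term equals $-2 b^2 x_2^2$ and the quadratic term equals $+2 b^2 x_2^2$, so `worst` reflects floating-point rounding only.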
Figure 4.5: The stationary density and derived approximate controllability
function for the pendulum system. Monte-Carlo approach used 50,000 sample paths
for white noise driven system. Top left: approximate stationary density (coarse
grid); Top right: approximate controllability function (coarse grid); Bottom left:
approximate stationary density (fine grid); Bottom right: approximate
controllability function (fine grid).

Figure 4.6: The controllability function and HJB residual for the pendulum system
(low resolution grid). Top: Approximate controllability function (Monte-Carlo)
and HJB residual; Bottom: Exact controllability function and HJB residual.

Figure 4.7: The controllability function and HJB residual for the pendulum system
(high resolution grid). Top: Approximate controllability function (Monte-Carlo)
and HJB residual; Bottom: Exact controllability function and HJB residual.
Observability Function

In the case where we measure angular velocity, i.e., $h(x) = x_2$, we can easily solve
the Lyapunov-type PDE (4.15) satisfied by $L_o$ to give an exact formula for the
observability function. We obtain an expression for $L_o$, given by

$$L_o(x) = -\frac{m G L}{2b} \cos(x_1) + \frac{k}{4b}\, x_1^2 + \frac{m L^2}{4b}\, x_2^2 + \frac{m G L}{2b} \tag{4.174}$$

and with substituted values by

$$L_o(x) = -1.25 \cos(x_1) + 0.125\, x_1^2 + 1.25\, x_2^2 + 1.25 \tag{4.175}$$

In the case where we measure the joint angle, i.e., $h(x) = x_1$, we cannot obtain
a closed form solution of (4.15). Instead, we use Algorithm 4.5.14 to compute an
approximation. In addition, to study the performance of the algorithm, we also
use Algorithm 4.5.14 to compute an approximation in the case where $h(x) = x_2$.
Again, for purposes of simulation, we assumed that steady-state was reached after
60 time units, 6 times the largest time constant of the system.

We investigate the performance of the algorithm by comparing the approximate
$L_o$ with the exact $L_o$ (velocity output case) given by (4.174). Moreover, we compute
and plot the Lyapunov residual, i.e., the right hand side of (4.15), given by

$$p_o(x) = \frac{\partial L_o}{\partial x}(x)\, f(x) + \frac{1}{2}\, h^T(x)\, h(x) \tag{4.176}$$

Figure 4.8: The observability function and Lyapunov residual for the pendulum
system with velocity output. Top: Approximate observability function; Bottom:
Exact observability function.

The results are shown for the cases of velocity and position read-out,
respectively, in Figures 4.8 and 4.9. As before, the large fluctuations in the residuals
at the edges of the grid are due to numerical errors in computing derivatives at
the edges and should be ignored. All residuals are zero or nearly zero at all grid
points. Moreover, there is negligible difference between the exact observability
function computed via (4.174) and the approximate observability function
computed using Algorithm 4.5.14. The performance of the algorithm and its
numerical implementation in the nonlinear balancing toolbox appear to be good for this
example.
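
To illustrate why a residual computed from grid data is nearly, but not exactly, zero even for an exact function, the Python sketch below samples the closed form (4.175) on a grid with spacing $\Delta x = 0.1$ and evaluates the Lyapunov residual (4.176) with central differences at interior points. The grid extent $[-1, 1]^2$ and the choice to skip the boundary points are assumptions of the sketch, not a description of the toolbox.

```python
import math

# Parameter values from Figure 4.4
m, b, k, L, G = 1.0 / 40.0, 2.0, 1.0, 20.0, 10.0
dx = 0.1
N = 21                                   # assumed grid on [-1, 1] x [-1, 1]
axis = [-1.0 + dx * i for i in range(N)]


def f(x1, x2):
    return (x2,
            -(G / L) * math.sin(x1) - k / (m * L ** 2) * x1
            - b / (m * L ** 2) * x2)


def Lo_exact(x1, x2):
    """Closed form (4.175), velocity read-out h(x) = x2."""
    return -1.25 * math.cos(x1) + 0.125 * x1 ** 2 + 1.25 * x2 ** 2 + 1.25


# Discretized function: values at the grid points only
Lo = [[Lo_exact(x1, x2) for x2 in axis] for x1 in axis]

residual = {}
for i in range(1, N - 1):                # interior points only
    for j in range(1, N - 1):
        d1 = (Lo[i + 1][j] - Lo[i - 1][j]) / (2.0 * dx)
        d2 = (Lo[i][j + 1] - Lo[i][j - 1]) / (2.0 * dx)
        f1, f2 = f(axis[i], axis[j])
        h = axis[j]
        residual[(i, j)] = d1 * f1 + d2 * f2 + 0.5 * h * h

worst = max(abs(r) for r in residual.values())
```

The remaining residual comes from the finite-difference error in the cosine term and shrinks as the grid is refined; one-sided differences at the boundary would be less accurate, which is consistent with the edge effects noted above.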


Balanced  Realization 

Figure 4.9: The observability function and Lyapunov residual for the pendulum
system with position output.

We now use the previously computed $L_c$ and $L_o$, and the algorithms presented
in Section 4.5.4 and implemented in the nonlinear balancing toolbox, to compute
balanced realizations for both pendulum systems, i.e., with position and velocity
read-outs.

As a minor digression, we note that it is possible to calculate an expression to
approximate the inverse Morse coordinate transformation of

$$L_c(x) = -20 \cos(x_1) + 2 x_1^2 + 20 x_2^2 + 20 \tag{4.177}$$

around its non-degenerate critical point at 0. Expanding the cosine function yields

$$L_c(x) = 12 x_1^2 - \frac{5}{6} x_1^4 + O(x_1^6) + 20 x_2^2 = x^T H(x)\, x \tag{4.178}$$

where

$$H(x) = \begin{bmatrix} 12 - \frac{5}{6} x_1^2 + O(x_1^4) & 0 \\ 0 & 20 \end{bmatrix} \tag{4.179}$$

Thus, the inverse Morse coordinate transformation is given, for
$\left| \frac{5}{72} x_1^2 + O(x_1^4) \right| < 1$, by

$$x \mapsto \begin{bmatrix} \sqrt{12 - \frac{5}{6} x_1^2 + O(x_1^4)}\;\, x_1 \\ \sqrt{20}\; x_2 \end{bmatrix} \tag{4.180}$$
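
The map (4.180) can be checked numerically: by construction the squared norm of its image should reproduce $L_c(x)$. The Python sketch below compares an exact version, which uses $H_{11}(x) = 2 + 20 (1 - \cos x_1)/x_1^2$ obtained by dividing the $x_1$-dependent part of (4.177) by $x_1^2$, with the truncated series in which the higher-order terms are dropped. The function names and the test point are illustrative.

```python
import math


def Lc(x1, x2):
    """Exact controllability function (4.177)."""
    return -20.0 * math.cos(x1) + 2.0 * x1 ** 2 + 20.0 * x2 ** 2 + 20.0


def h11(x1):
    """Exact (1,1) entry of H(x): 2 + 20 (1 - cos x1) / x1^2."""
    if abs(x1) < 1e-4:
        return 12.0 - (5.0 / 6.0) * x1 ** 2     # series value near 0
    return 2.0 + 20.0 * (1.0 - math.cos(x1)) / x1 ** 2


def psi_exact(x1, x2):
    return (math.sqrt(h11(x1)) * x1, math.sqrt(20.0) * x2)


def psi_series(x1, x2):
    """Truncated form of (4.180): higher-order terms dropped."""
    return (math.sqrt(12.0 - (5.0 / 6.0) * x1 ** 2) * x1,
            math.sqrt(20.0) * x2)


x = (0.5, -0.3)
z = psi_exact(*x)
z_s = psi_series(*x)
gap_exact = abs(z[0] ** 2 + z[1] ** 2 - Lc(*x))
gap_series = abs(z_s[0] ** 2 + z_s[1] ** 2 - Lc(*x))
```

The exact map reproduces $L_c$ to rounding error, while the truncated series leaves a gap of order $x_1^6$, which is small near the origin and grows away from it.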
Although we have calculated an expression to approximate the inverse Morse
coordinate transformation, and an expression for its region of validity, we use
Algorithm 4.5.15 within the overall balancing procedure to compute a balanced
realization. Applying the remainder of the steps in Algorithm 4.5.13 produces
discretized approximations of the singular value functions $\sigma_1(x)$ and $\sigma_2(x)$ and the
functions $\bar{f}$, $\bar{g}$, and $\bar{h}$ in the balanced realization $(\bar{f}, \bar{g}, \bar{h})$.

The  computed  singular  value  functions  for  the  pendulum  systems  with  position 
and  velocity  outputs,  respectively,  are  shown  in  Figures  4.10  and  4.11.  Because 
the  pendulum  system  is  nearly  linear,  we  expect  the  singular  value  functions  to 
be  nearly  constant  at  the  value  of  the  corresponding  Hankel  singular  values  of  the 
linearized  system.  This  is  reflected  in  the  computations. 

In  the  case  of  measured  joint  angle,  the  singular  value  functions  are  nearly 
constant  at  grid  points  close  to  the  origin,  taking  values  close  to  0.367  and  0.284. 
This closely matches the Hankel singular values of the linearized system. One state
is roughly 1.3 times as important as the other to the input-to-output behavior of the system.

In  the  case  of  measured  angular  velocity,  the  singular  value  functions  are  again 
nearly  constant  at  grid  points  close  to  the  origin.  Here,  the  two  functions  are 
nearly  equal,  taking  values  close  to  0.252  and  0.248.  This  closely  matches  the 
Hankel  singular  values  of  the  linearized  system.  Both  states  are  equally  important 
to  the  input-to-output  behavior  of  the  system.  This  is  expected  from  a  physical 
standpoint,  since  there  is  a  duality  between  the  torque  input  and  angular  velocity 
output. 

Remark 4.6.2 Because this example system has only two states, we do not delete
any states in order to produce a reduced model. Such a reduction would be dubious,
since a system with only one state is qualitatively different from a two state model,
i.e., cannot exhibit the same behaviors, e.g., oscillation. □

Figure 4.10: The singular value functions for the pendulum system with position
output. Left: $\sigma_1(x)$ nearly constant 0.367; Right: $\sigma_2(x)$ nearly constant 0.284.

Figure 4.11: The singular value functions for the pendulum system with velocity
output. Left: $\sigma_1(x)$ nearly constant 0.252; Right: $\sigma_2(x)$ nearly constant 0.248.

Algorithm 4.5.13 produces discretized approximations to the functions $\bar{f}$, $\bar{g}$,
and $\bar{h}$, i.e., gives their values at the grid points. In order to simulate the balanced
system, we need explicit expressions for evaluating these functions anywhere in
a neighborhood of 0. Therefore, we approximate the discretized functions with degree-4
polynomials using a linear least squares approximation scheme.
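
A least-squares polynomial fit of grid data can be sketched as follows. The Python code below fits a bivariate polynomial of total degree 4 (15 monomials) by solving the normal equations with Gaussian elimination; this is one simple way to set up a linear least squares problem and is not claimed to be the scheme used in the toolbox. The grid, the test function, and all names are hypothetical.

```python
def monomials(x1, x2, degree=4):
    """All monomials x1^i * x2^j with i + j <= degree."""
    return [x1 ** i * x2 ** j
            for i in range(degree + 1) for j in range(degree + 1 - i)]


def solve(M, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            fac = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= fac * a[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def fit_poly(points, values, degree=4):
    """Coefficients c solving the normal equations (Phi^T Phi) c = Phi^T v."""
    rows = [monomials(x1, x2, degree) for x1, x2 in points]
    n = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
            for i in range(n)]
    rhs = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(n)]
    return solve(gram, rhs)


def eval_poly(coeffs, x1, x2, degree=4):
    return sum(c * t for c, t in zip(coeffs, monomials(x1, x2, degree)))


def target(x1, x2):
    # A known quartic, so the fit should reproduce it
    return 0.5 * x2 - 0.6 * x1 + 0.1 * x1 ** 3 - 0.02 * x1 ** 2 * x2 ** 2


axis = [-1.0 + 0.1 * i for i in range(21)]
points = [(p, q) for p in axis for q in axis]
coeffs = fit_poly(points, [target(*pt) for pt in points])
```

Once the coefficients are available, `eval_poly` gives a value at any point of the region, which is what a simulation of the transformed system requires.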

We computed balanced realizations for the pendulum systems with position
and velocity outputs in two ways: using the exact controllability function given
by (4.171) and the approximate controllability function derived from Monte-Carlo
data. We then simulated the eight systems (original and balanced coordinates;
exact and approximate controllability function; position and velocity read-outs)
using two input signals: $u = 0$ (natural response) and $u(t) = 0.5 \sin(t/\pi)$. The
output responses are shown, for the pendulum system with measured joint angle
and angular velocity, respectively, in Figures 4.12 and 4.13.

Theoretically,  the  output  responses  of  the  original  and  balanced  systems  should 
be  identical,  since  they  are  merely  different  representations  of  the  same  physical 
system.  However,  the  computations  introduce  numerical  error.  We  observe  that 
by  using  the  exact  controllability  function,  the  output  responses  of  the  original  and 
balanced  systems  are  virtually  identical.  Thus,  the  algorithms  for  computing  the 
Morse,  input-normal,  and  balancing  transformations  introduced  negligible  error. 
On  the  other  hand,  when  using  the  approximate  controllability  function  generated 
using  Monte-Carlo  data,  the  output  responses  of  the  original  and  balanced  systems 
deviate  somewhat.  Thus,  a  better  approximation  may  be  desirable,  which  can  be 
achieved  by  generating  additional  Monte-Carlo  data. 

4.6.2  Toward a Balanced Realization for the Double Pendulum

We now consider a double pendulum system as illustrated in Figure 4.14. As with
the pendulum system of Section 4.6.1, the system incorporates linear torsional
damping, linear torsional stiffness, and torque inputs at the rotary joints. We
assume that the shafts are massless and that the pendulum moves only in the
plane. We measure the horizontal position of the end-effector as the system output
(a nonlinear function of the state variables).

Figure 4.12: Output response for the pendulum system with position read-out:
original coordinates (solid) vs. balanced coordinates (dashed). Top left: zero
input, exact $L_c$; Top right: sinusoidal input, exact $L_c$; Bottom left: zero input,
approximate $L_c$; Bottom right: sinusoidal input, approximate $L_c$.

Figure 4.13: Output response for the pendulum system with velocity read-out:
original coordinates (solid) vs. balanced coordinates (dashed). Top left: zero
input, exact $L_c$; Top right: sinusoidal input, exact $L_c$; Bottom left: zero input,
approximate $L_c$; Bottom right: sinusoidal input, approximate $L_c$.


Symbol       Value   Description
$\theta_1$           joint 1 angle
$\tau_1$             joint 1 applied torque
$m_1$        1       end mass 1
$b_1$        1       joint 1 torsional damping coefficient
$k_1$        1       joint 1 torsional stiffness coefficient
$L_1$        1       length of shaft 1
$\theta_2$           joint 2 angle
$\tau_2$             joint 2 applied torque
$m_2$        1       end mass 2
$b_2$        1       joint 2 torsional damping coefficient
$k_2$        1       joint 2 torsional stiffness coefficient
$L_2$        1       length of shaft 2
$G$          10      gravitational acceleration

Figure 4.14: Planar double pendulum system with massless shafts, linear torsional
damping, linear torsional stiffness, and torque input applied at the rotary joints.
Values of parameters are provided for the numerical studies that we conducted.


State-Space  Realization 

As before, we obtain a state-space realization $(f, g, h)$ for the double pendulum system via
the Euler-Lagrangian mechanics outlined in Section 2.7. Let $q = (\theta_1, \theta_2)$ and $\dot{q} =
(\dot{\theta}_1, \dot{\theta}_2)$ denote the generalized positions and velocities corresponding, respectively,
to joint angles and angular velocities. Let the generalized forces be given by the
applied joint torques, i.e., $F = (\tau_1 - \tau_2, \tau_2)$. The kinetic, potential, and dissipation
energies are given, respectively, by
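\begin{align}
K(q, \dot q) ={}& \tfrac{1}{2} m_1 L_1^2 \dot q_1^2 + \tfrac{1}{2} m_2 L_1^2 \dot q_1^2 \notag \\
 & + \tfrac{1}{2} m_2 L_2^2 (\dot q_1 + \dot q_2)^2
   + m_2 L_1 L_2 \cos(q_2)\, \dot q_1 (\dot q_1 + \dot q_2) \tag{4.181} \\
U(q, \dot q) ={}& \tfrac{1}{2} k_1 q_1^2 + \tfrac{1}{2} k_2 q_2^2 \notag \\
 & - (m_1 + m_2)\, G L_1 \cos(q_1) - m_2 G L_2 \cos(q_1 + q_2) \tag{4.182} \\
R(q, \dot q) ={}& \tfrac{1}{2} b_1 \dot q_1^2 + \tfrac{1}{2} b_2 \dot q_2^2 \tag{4.183}
\end{align}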


We  apply  the  Euler-Lagrange  equation  of  motion  (2.113),  i.e., 


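\[
\frac{d}{dt} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q}
  = F - \frac{\partial R}{\partial \dot q} \tag{4.184}
\]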


for  L  =  K  —  U  to  obtain  the  equation  of  motion  for  the  double  pendulum  system, 
given  by 

\[
M(q)\, \ddot q + C(q, \dot q) + N(q, \dot q) = F \tag{4.185}
\]


where 


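\begin{align}
M(q) &= \begin{bmatrix}
  (m_1 + m_2) L_1^2 + m_2 L_2^2 + 2 m_2 L_1 L_2 \cos(q_2) & m_2 L_2^2 + m_2 L_1 L_2 \cos(q_2) \\
  m_2 L_2^2 + m_2 L_1 L_2 \cos(q_2) & m_2 L_2^2
\end{bmatrix} \tag{4.186} \\
C(q, \dot q) &= \begin{bmatrix}
  -m_2 L_1 L_2 \sin(q_2)\, (2 \dot q_1 \dot q_2 + \dot q_2^2) \\
  m_2 L_1 L_2 \sin(q_2)\, \dot q_1^2
\end{bmatrix} \tag{4.187} \\
N(q, \dot q) &= \begin{bmatrix}
  (m_1 + m_2)\, G L_1 \sin(q_1) + m_2 G L_2 \sin(q_1 + q_2) + k_1 q_1 + b_1 \dot q_1 \\
  m_2 G L_2 \sin(q_1 + q_2) + k_2 q_2 + b_2 \dot q_2
\end{bmatrix} \tag{4.188}
\end{align}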


The affine nonlinear control system is realized in coordinates
$x = (x_1, x_2, x_3, x_4) = (q_1, q_2, \dot q_1, \dot q_2)$ by

\[
f(x) = \begin{bmatrix} \dot q \\ -M^{-1}(q) \left( C(q, \dot q) + N(q, \dot q) \right) \end{bmatrix},
\qquad
g(x) = \begin{bmatrix} 0 \\ M^{-1}(q) \end{bmatrix} \tag{4.189}
\]

and $h(x) = L_1 \sin(q_1) + L_2 \sin(q_1 + q_2)$.
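
The realization above can be sketched numerically as follows. This is a minimal illustration, not the code used for the studies reported here: it assumes NumPy, takes the parameter values from Figure 4.14, and uses hypothetical function names (`mass_matrix`, `coriolis`, `gravity_stiffness_damping`, `jacobian`). The input map `g` follows (4.189) as written, i.e., with the generalized forces F as inputs. A central-difference Jacobian at the origin gives a quick consistency check on the drift term and the output map.

```python
import numpy as np

# Parameter values from Figure 4.14 (all unity except G = 10).
m1 = m2 = 1.0
b1 = b2 = 1.0
k1 = k2 = 1.0
L1 = L2 = 1.0
G = 10.0


def mass_matrix(q):
    """Inertia matrix M(q) of (4.186)."""
    c2 = np.cos(q[1])
    m11 = (m1 + m2) * L1**2 + m2 * L2**2 + 2.0 * m2 * L1 * L2 * c2
    m12 = m2 * L2**2 + m2 * L1 * L2 * c2
    return np.array([[m11, m12], [m12, m2 * L2**2]])


def coriolis(q, qd):
    """Velocity-dependent vector C(q, qdot) of (4.187)."""
    s2 = np.sin(q[1])
    return np.array([
        -m2 * L1 * L2 * s2 * (2.0 * qd[0] * qd[1] + qd[1]**2),
        m2 * L1 * L2 * s2 * qd[0]**2,
    ])


def gravity_stiffness_damping(q, qd):
    """Gravity, stiffness, and damping vector N(q, qdot) of (4.188)."""
    s12 = np.sin(q[0] + q[1])
    return np.array([
        (m1 + m2) * G * L1 * np.sin(q[0]) + m2 * G * L2 * s12
        + k1 * q[0] + b1 * qd[0],
        m2 * G * L2 * s12 + k2 * q[1] + b2 * qd[1],
    ])


def f(x):
    """Drift vector field of (4.189) with x = (q1, q2, q1dot, q2dot)."""
    q, qd = x[:2], x[2:]
    rhs = coriolis(q, qd) + gravity_stiffness_damping(q, qd)
    return np.concatenate([qd, -np.linalg.solve(mass_matrix(q), rhs)])


def g(x):
    """Input vector fields of (4.189); inputs are the generalized forces F."""
    return np.vstack([np.zeros((2, 2)), np.linalg.inv(mass_matrix(x[:2]))])


def h(x):
    """Output map: horizontal position of the second mass."""
    return L1 * np.sin(x[0]) + L2 * np.sin(x[0] + x[1])


def jacobian(fun, x0, eps=1e-6):
    """Central-difference Jacobian (rows: outputs, columns: states)."""
    x0 = np.asarray(x0, dtype=float)
    cols = []
    for i in range(x0.size):
        dx = np.zeros_like(x0)
        dx[i] = eps
        diff = np.atleast_1d(fun(x0 + dx)) - np.atleast_1d(fun(x0 - dx))
        cols.append(diff / (2.0 * eps))
    return np.column_stack(cols)


A_num = jacobian(f, np.zeros(4))  # numerical df/dx at the origin
C_num = jacobian(h, np.zeros(4))  # numerical dh/dx at the origin
```

Up to finite-difference error, `A_num` and `C_num` can be compared entry by entry with the linearization of the system about 0.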


System  Properties 

We have verified local accessibility and local observability of the system via
calculations performed using the symbolic computation capabilities of MATLAB. The
expressions for the various brackets, accessibility algebra, and observability
codistribution are too lengthy and complicated to include here. The system is
locally accessible at 0 since


\[
\dim \left( \operatorname{span} \{ g_1, [f, g_1], [f, [f, g_1]], [f, [f, [f, g_1]]] \} \big|_{x=0} \right) = 4 \tag{4.190}
\]


Furthermore,  the  system  is  locally  observable  at  0  since 


dim (span {dh, dL_f h, dL[fAgu[Lgi]]]h, dL[Ugi:[gu[mm]h} |x=0) = 4    (4.191)

Finally, we note that the system is asymptotically stable, since, for
$A = \frac{\partial f}{\partial x}(0)$, $\operatorname{spec}(A) \subset \mathbb{C}^-$.

The linearization about 0 is given by

\[
A = \begin{bmatrix}
  0 & 0 & 1 & 0 \\
  0 & 0 & 0 & 1 \\
  -11 & 12 & -1 & 2 \\
  12 & -35 & 2 & -5
\end{bmatrix},
\qquad
B = \begin{bmatrix}
  0 & 0 \\
  0 & 0 \\
  1 & -3 \\
  -2 & 7
\end{bmatrix} \tag{4.192}
\]

and C = [2 1 0 0]. The Hankel singular values of the linearization are 0.5029,
0.4702, 0.0249, and 0.0106.
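
As an illustration of how Hankel singular values of a linearization can be obtained, the sketch below uses the standard Gramian-based definition (square roots of the eigenvalues of the product of the controllability and observability Gramians). It is not the code used in this work: it assumes NumPy, uses the matrices as read from (4.192), and solves the two Lyapunov equations by Kronecker-product vectorization, which is adequate only for small systems. The helper names are hypothetical.

```python
import numpy as np

# Linearization about 0, as read from (4.192).
A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-11.0, 12.0, -1.0, 2.0],
              [12.0, -35.0, 2.0, -5.0]])
B = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, -3.0], [-2.0, 7.0]])
C = np.array([[2.0, 1.0, 0.0, 0.0]])


def solve_lyapunov(F, W):
    """Solve F X + X F^T + W = 0 for X by row-major vectorization."""
    n = F.shape[0]
    eye = np.eye(n)
    # vec(F X) = kron(F, I) vec(X) and vec(X F^T) = kron(I, F) vec(X).
    K = np.kron(F, eye) + np.kron(eye, F)
    return np.linalg.solve(K, -W.reshape(-1)).reshape(n, n)


def hankel_singular_values(A, B, C):
    """Return (hsv, P, Q) with hsv sorted in descending order."""
    P = solve_lyapunov(A, B @ B.T)      # controllability Gramian
    Q = solve_lyapunov(A.T, C.T @ C)    # observability Gramian
    P = 0.5 * (P + P.T)                 # remove round-off asymmetry
    Q = 0.5 * (Q + Q.T)
    Lp = np.linalg.cholesky(P)          # P = Lp Lp^T
    # Lp^T Q Lp is symmetric and has the same eigenvalues as P Q.
    ev = np.linalg.eigvalsh(Lp.T @ Q @ Lp)
    return np.sqrt(np.clip(ev, 0.0, None))[::-1], P, Q


hsv, P, Q = hankel_singular_values(A, B, C)
print(hsv)
```

The printed values can be compared with the Hankel singular values quoted above; the Cholesky-based form is chosen here only because it keeps the eigenvalue problem symmetric.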
Controllability Function


The double pendulum system is not integrable and does not, in general, satisfy
the equipartition of energy condition. However, in the special case where
$b_1 = b = b_2$, the equipartition of energy condition is satisfied with ratio
equal to $b$. Applying Theorem 4.3.14, the controllability function for the double
pendulum system is given, exactly, by

\begin{align}
L_c(x) ={}& b\, (m_1 + m_2) L_1^2 x_3^2 + b\, m_2 L_2^2 (x_3 + x_4)^2
   + 2 b\, m_2 L_1 L_2 \cos(x_2)\, x_3 (x_3 + x_4) \tag{4.193} \\
 & + b\, k_1 x_1^2 + b\, k_2 x_2^2 - 2 b\, (m_1 + m_2)\, G L_1 \cos(x_1)
   - 2 b\, m_2 G L_2 \cos(x_1 + x_2) \notag \\
 & + 2 b\, G \left( (m_1 + m_2) L_1 + m_2 L_2 \right) \tag{4.194}
\end{align}

and after substitution of the parameter values given in Figure 4.14 by

\begin{align}
L_c(x) ={}& 2 x_3^2 + (x_3 + x_4)^2 + 2 \cos(x_2)\, x_3 (x_3 + x_4) + x_1^2 + x_2^2 \notag \\
 & - 40 \cos(x_1) - 20 \cos(x_1 + x_2) + 60 \tag{4.195}
\end{align}

The  controllability  function  Lc  for  the  double  pendulum  system  is  shown  in 
Figure  4.15. 
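
The closed-form expression is easy to evaluate directly. The following sketch, which is illustrative only, codes the general formula (4.193)-(4.194) and the substituted form (4.195) as two hypothetical functions, `lc_general` and `lc_numeric`, so that the substitution of the Figure 4.14 parameter values can be checked at arbitrary states. The factor b on the k2 term is an assumption made by symmetry with the k1 term; with b = 1 it does not affect the comparison.

```python
import math


def lc_general(x, b=1.0, m1=1.0, m2=1.0, k1=1.0, k2=1.0,
               L1=1.0, L2=1.0, G=10.0):
    """Controllability function of (4.193)-(4.194); defaults from Figure 4.14."""
    x1, x2, x3, x4 = x
    return (b * (m1 + m2) * L1**2 * x3**2
            + b * m2 * L2**2 * (x3 + x4)**2
            + 2.0 * b * m2 * L1 * L2 * math.cos(x2) * x3 * (x3 + x4)
            + b * k1 * x1**2
            + b * k2 * x2**2  # factor b assumed, by symmetry with the k1 term
            - 2.0 * b * (m1 + m2) * G * L1 * math.cos(x1)
            - 2.0 * b * m2 * G * L2 * math.cos(x1 + x2)
            + 2.0 * b * G * ((m1 + m2) * L1 + m2 * L2))


def lc_numeric(x):
    """Controllability function after substitution of parameters, (4.195)."""
    x1, x2, x3, x4 = x
    return (2.0 * x3**2 + (x3 + x4)**2
            + 2.0 * math.cos(x2) * x3 * (x3 + x4)
            + x1**2 + x2**2
            - 40.0 * math.cos(x1) - 20.0 * math.cos(x1 + x2) + 60.0)


# Compare the two forms at a few sample states.
samples = [(0.0, 0.0, 0.0, 0.0), (0.3, -0.2, 0.5, 0.1), (-1.0, 0.7, -0.4, 0.9)]
for x in samples:
    print(x, lc_general(x), lc_numeric(x))
```

Both forms vanish at the origin, since the constant term cancels the two cosine terms there.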

Observability  Function 

We use Algorithm 4.5.14 to compute an approximation of the observability function
Lo for the double pendulum system, shown in Figure 4.16.

Balanced  Realization 

Figure 4.17 shows a 2-dimensional slice of each singular value function for the
double pendulum system. At the origin, the singular value functions take the
values, respectively, 0.487, 0.444, 0.135, and 0.050. These values are reasonably
close to what we expect from the Hankel singular values of the linearization. Two
states of the balanced realization have considerably greater input-to-output
importance than the other two states. We also observe that numerical errors are
more prominent for the singular value functions of small magnitude, i.e., the
oscillations that they display are likely caused by numerical error rather than
being an accurate reflection of their actual behavior.


4.7  Remarks 

We have presented methods and algorithms to compute the energy functions and
coordinate transformations involved in the Scherpen theory and procedure for
nonlinear balancing. We have shown that, under certain conditions, an exact formula
for the controllability energy function can be derived. We applied our result to
compute the controllability function for a 4-dimensional mechanical system. For
other situations, we offer a Monte-Carlo approach for approximating the
controllability function. For a 2-dimensional mechanical system, the Monte-Carlo
approach yielded a good approximation.

We have presented an algorithm for a numerical implementation of the Morse-Palais
lemma, which produces a local coordinate transformation under which a real-valued
function with a non-degenerate critical point is quadratic on a neighborhood of
the critical point. Application of the algorithm to the controllability function
plays a key role in computing the balanced representation.

We have applied our methods and algorithms to derive balanced realizations for
nonlinear state-space models of two example mechanical systems. Simulation results
demonstrate that the algorithms produce accurate and useful approximations to the
energy functions and transformations involved in the nonlinear balancing
procedure. For a 2-dimensional system, the approximate balanced realization
produced input-to-output behavior that is nearly equivalent to that generated by
the original, physically derived, realization. Thus, it serves as an equivalent
representation, with the benefit that it provides a meaningful ranking of state
components for purposes of model reduction.

The  algorithms  are  currently  too  computationally  intensive  to  be  practical  for 
high-order  systems.  It  is  likely  that  new  algorithms  will  be  required  in  order  for  the 
Scherpen  procedure  to  become  genuinely  useful  for  model  reduction  of  nonlinear 
systems. 

Figure 4.15: Controllability function for double pendulum (6 planes). Top left:
x3 = 0 = x4; Top right: x2 = 0 = x4; Mid left: x2 = 0 = x3; Mid right:
x1 = 0 = x4; Bottom left: x1 = 0 = x3; Bottom right: x1 = 0 = x2.


Figure 4.16: Observability function for double pendulum (6 planes). Top left:
x3 = 0 = x4; Top right: x2 = 0 = x4; Mid left: x2 = 0 = x3; Mid right:
x1 = 0 = x4; Bottom left: x1 = 0 = x3; Bottom right: x1 = 0 = x2.

Figure 4.17: Singular value functions for double pendulum (x3-x4 plane). Top left:
σ1(x); Top right: σ2(x); Bottom left: σ3(x); Bottom right: σ4(x).

Chapter 5


Modeling  and  Optimization  for 
Silicon  Growth  via  RTCVD 

5.1  Introduction 

This chapter addresses the problem of developing high-fidelity physical-chemical
models for predicting the behavior and output of a commercial rapid thermal
CVD reactor used for depositing thin films of Si and Si-Ge on silicon wafers.
The problem is studied within the context of a joint project between the ISR of
the University of Maryland, College Park, and Northrop Grumman ESSS (NG-ESSS),
for modeling and optimization of epitaxial growth in the ASM Epsilon-1 rapid
thermal CVD reactor [115, 116, 117]. The Epsilon-1 is a single-wafer lamp-heated
CVD reactor manufactured by ASM America, Inc., Phoenix, AZ. It is used by NG-ESSS
to deposit layers of epitaxial Si-Ge, epitaxial silicon (epi-Si), and
polycrystalline silicon (poly-Si) on a silicon wafer.

The modeling of fundamental aspects of CVD involves both chemical kinetics and
transport phenomena. Depending on the specific process and operating conditions,
it is often assumed that one or more factors have significantly more influence
than all others over the deposition product. In those cases, the factors that
are considered less important are often completely or mostly neglected. This type
of simplification has been adopted frequently in order to formulate relatively
simple models describing specific components of the CVD process (e.g., heat
transfer within and among solids) under a limited range of operating conditions
(e.g., low-temperature regime) for a specific purpose (e.g., temperature control).

For  example,  NG-ESSS  is  interested  in  models  that  focus  on  low  temperature 
silicon  epitaxy.  In  the  low  temperature  regime,  growth  rate  is  limited  by  surface 
reaction  phenomena  rather  than  by  mass  transport  phenomena.  Furthermore, 
surface  reaction  phenomena  such  as  adsorption  and  desorption  of  reactant  species 
are  strongly  dependent  on  temperature.  For  this  reason,  the  low  temperature 
regime  is  said  to  be  thermally  driven  (kinetically  limited).  These  simplifications 
motivate  an  approach  in  which  physical-chemical  models  consist  only  of  a  simplified 
conjugate  heat  transfer  model  for  thermal  dynamics  together  with  an  Arrhenius  law 
for  growth  rate  in  terms  of  temperature.  In  this  approach,  dynamics  of  transport 
phenomena  are  neglected,  inlet  conditions  for  gas  phase  species  concentrations  and 
temperature  are  assumed  to  hold  throughout  the  process  chamber,  and  chamber 
geometry  plays  no  role. 
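
The Arrhenius-type growth-rate law used in such simplified models can be sketched as below. This is only an illustration of the functional form: the function name `arrhenius_growth_rate`, the unit pre-exponential factor, and the activation energy of 1.6 eV are hypothetical placeholders, not fitted values from this work.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_growth_rate(temp_c, prefactor, activation_energy_ev):
    """Growth rate prefactor * exp(-Ea / (k_B T)) for a temperature in deg C.

    The rate is returned in the units of `prefactor`.
    """
    temp_k = temp_c + 273.15  # convert to kelvin
    return prefactor * math.exp(
        -activation_energy_ev / (BOLTZMANN_EV_PER_K * temp_k))


# Hypothetical parameters, chosen only to show the temperature sensitivity.
PREFACTOR = 1.0   # arbitrary units
EA_EV = 1.6       # placeholder activation energy in eV

for temp_c in (650.0, 675.0, 700.0):
    rate = arrhenius_growth_rate(temp_c, PREFACTOR, EA_EV)
    print(f"T = {temp_c:.0f} C, relative rate = {rate:.3e}")
```

Because the rate depends exponentially on inverse temperature, a model of this kind attributes any spatial variation in growth rate entirely to variation in surface temperature.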

However, as we show in this chapter, these simplifications seriously compromise
the utility of such models for purposes of studying uniformity issues for thin
film growth in the Epsilon-1. We demonstrate that chamber geometry and a variety
of complex phenomena are essential elements, including depletion of reactants,
non-uniform gas heating, gas phase chemistry, thermal diffusion, and gas flow
patterns. This necessitates the incorporation of detailed models for
three-dimensional effects of gas flow, gas phase heat transfer, and transport of
chemical species, in addition to the previously mentioned heat transfer phenomena.
Furthermore, a more comprehensive model for surface reaction chemical kinetics is
required to incorporate reactive intermediates produced in the gas phase. Finally,
it is crucial that the models reflect the coupling among these various phenomena
in the process chamber.

We address these problems via development of a process-equipment model
which accounts for the mechanisms and factors described above. The model is
capable of predicting gas flow, heat transfer, species transport, and chemical
mechanisms in the reactor given a process recipe for temperature, pressure, and
flow rate set-points. It provides a platform for studying the effect of equipment
settings and a relatively broad range of process conditions on deposition product
characteristics, e.g., deposition rate and thickness uniformity.

The  modeling  effort  includes  development  of  physical  and  chemical  models  for 
fundamental  CVD  phenomena,  experimental  determination  of  growth  parameters, 
and  experimental  validation  of  model  predictions.  Simulations  are  used  as  tools  to 
predict  deposition  results,  study  the  factors  that  affect  deposition  uniformity,  and 
determine  operating  parameters  for  improved  performance  and  product  quality. 

Specific applications include prediction and control of deposition rate and
thickness uniformity; studying sensitivity of deposition rate to process settings
such as temperature, pressure, and flow rates; and reducing the use of consumables
via purge flow optimization. The implications of various simulation results are
discussed in terms of how they can be used to reduce costs and improve product
quality, e.g., thickness uniformity of thin films. We demonstrate that achieving
deposition uniformity requires some degree of temperature non-uniformity to
compensate for the effects of other phenomena such as reactant depletion, gas
heating and gas phase reactions, thermal diffusion of species, and flow patterns.

The semiconductor manufacturing environment in which this research was conducted
is described in Section 5.2, including manufacturing objectives, equipment
and materials involved, and a typical manufacturing situation of interest. The
procedure and results of poly-Si growth experiments for studying equipment
operation and deposition characteristics are given in Section 5.3. The overall
process-equipment model for silicon growth in the Epsilon-1 is detailed in
Section 5.4, including various components for prediction of the relevant transport
phenomena and chemical mechanisms. In Section 5.5, we apply the model, via
simulation, to study the factors that influence deposition rate and uniformity,
and present the results and analysis. We summarize and make some additional
remarks in Section 5.6.


5.2  Semiconductor  Manufacturing  Environment 

The modeling and analysis presented in this chapter pertain specifically to the
semiconductor manufacturing environment at NG-ESSS, e.g., silicon epitaxy using
the Epsilon-1 reactor, and are motivated specifically by problems encountered
within that environment. In this section we state the manufacturing objectives
that motivated the modeling effort, provide relevant details about the equipment
and processes, and offer some perspective and additional motivation through a
case study describing a typical manufacturing situation of interest.

5.2.1 Manufacturing Objectives

The overall objective of this research is to improve manufacturing effectiveness
for epitaxial growth of silicon and Si-Ge thin films on a silicon wafer in the
Epsilon-1 reactor, a production tool currently in use at NG-ESSS. Improvement in
product quality, increased flexibility of operation, and reduction of
manufacturing costs are integral to achieving the overall objective. We provide
specific details in the following explanations and descriptions, based on
discussions with and demonstrations by NG-ESSS personnel [128].

Within the scope of this research, product quality is determined solely by
deposition thickness uniformity. Other factors, such as film composition and
resistivity uniformity, are important quality measures but are not considered
here. Thickness
variations  of  5%  are  currently  acceptable  for  most  applications,  although  there  is 
no  guarantee  that  such  a  specification  will  remain  stable.  Currently,  variations  in 
the  range  of  2%  are  routinely  achieved  with  the  Epsilon-1.  Improved  results  are 
always  desirable. 

The Epsilon-1 is capable of operation in several regimes for pressure,
temperature, and flow rates, and deposition via injection of several types of
precursor and carrier gases. Prediction of deposition rates and other film
characteristics for a given combination of process conditions is key to taking
advantage of the machine's flexibility. The manufacturer provides some predictive
guidance and data. However, there is the desire for manufacturing
"off-the-curve," i.e., operating in regimes and producing films with
characteristics that do not appear in manufacturer-provided information.

Performance of devices at high frequency is difficult to predict based on
properties of the product and manufacturing parameters. To achieve a product with
the desired properties, NG-ESSS operates with a three to four month manufacturing
cycle  followed  by  a  long  testing  cycle.  It  can  take  up  to  two  years  to  converge 
on  the  desired  product.  Each  manufacturing  cycle  requires  an  initial  period  of 
experimentation  in  which  the  necessary  equipment  settings  and  process  conditions 
are  determined.  Once  parameters  are  determined,  and  the  customer  is  satisfied, 
the  process  is  certified,  and  parameters  are  usually  not  changed  for  several  years  in 
order  to  provide  the  customer  with  a  consistent  product.  It  is  possible,  however, 
for  drift  of  equipment  characteristics  over  time  to  cause  degradation,  necessitating 
additional  experimentation.  In  addition,  the  chamber  tube  is  periodically  cleaned 
and  replaced,  requiring  a  re-calibration  of  process  settings.  The  result  is  that  the 
various  trial-and-error  steps  have  a  significant  impact  on  time-to-manufacture  and 
other  production  costs. 

Additional cost concerns include operational integrity and "down-time" of
equipment, and the use of consumables such as process gases. It is clear that
reductions in experimental steps, equipment failure, and gas consumption will have
a beneficial impact on manufacturing costs.

In light of manufacturing objectives, the modeling effort described in this
chapter seeks to gain an understanding of the processes and equipment via physical
and mathematical modeling, and use the resulting validated models for optimization
of process conditions and equipment settings.

5.2.2  Equipment  and  Materials 

The Epsilon-1 reactor is a radiantly heated, gas injected, single wafer processing
system for CVD of doped or undoped epitaxial and polycrystalline layers on a
150 mm (6 in) diameter semiconductor wafer. In this section we provide some
descriptive background material necessary for model development, including
characteristics of the process chamber and deposited and consumed materials; and
an overview of reactor operation including typical processing recipes, operating
conditions, equipment settings, and overall system structure.

Process  Chamber 

The  process  chamber  is  situated  within  the  Epsilon-1  reactor  system,  accessible  by 
the  wafer  handling  system  and  between  the  parts  of  the  lamp  assembly,  as  shown 
in  Figure  5.1.  Also  shown  is  a  cross-sectional  front  view  of  the  process  chamber 
and  wafer  rotation  apparatus.  A  cross-sectional  side  view  of  the  process  chamber 
and  lamp  assembly,  and  a  top-down  view  of  the  wafer  level  apparatus,  are  shown, 
respectively,  in  Figures  5.2  and  5.3.  The  inlet  and  outlet  sides  of  the  reactor  are 
referred  to  as  the  front  (upstream)  and  rear  (downstream),  respectively. 


Figure  5.1:  Epsilon-1  reactor  system  (left)  and  cross-section  (front  view)  of  the 
process  chamber  and  wafer  rotation  apparatus  (right).  Source:  ASM  Epsilon-1 
Reactor  Manual. 

Deposition takes place in the process chamber, which is a horizontally oriented
quartz tube of lenticular shape, i.e., a cross-sectional view looking into the
chamber front shows a flat bottom, short vertical sides, and curved top. The
quartz shelves are connected to the quartz chamber walls to form a contiguous
solid body.

Figure 5.2: Cross-section (side view) of the Epsilon-1 process chamber and lamp
assembly. Source: ASM Epsilon-1 Reactor Manual.

Figure 5.3: Overhead view of the Epsilon-1 at wafer level including thermocouple
locations.

The 150 mm (6 in) diameter wafer rests on small quartz pins attached to the
pocket of a rotating susceptor that is surrounded by the susceptor ring. The
susceptor and ring are constructed of graphite coated with silicon carbide. The
susceptor ring fits into a space within the quartz structure, leaving a thin gap
between ring and shelf on all sides. The susceptor fits into the ring structure,
and is supported and rotated by special apparatus located through and under the
lower chamber section. There is also a gap, although much smaller, between the
ring and susceptor.

Process gases are pumped into the chamber through the inlet flange, flow
horizontally through the chamber over the wafer surface, and are pumped out
through the exhaust flange via pneumatic actuators. An optional flow guide can be
used to force the inlet flow away from the chamber roof and toward the wafer. The
inlet flange is designed to create a specialized sonic flow with possible swirling
and mixing properties as the gas enters the process chamber. It has been observed
by the manufacturer that the inlet flange design aids in achieving deposition
uniformity [106].

The  chamber  is  divided  by  the  susceptor,  ring,  and  a  quartz  shelf  into  upper 
and  lower  sections.  Process  gases  enter  and  flow  through  the  upper  section;  purge 
gases  enter  into  the  lower  section.  Thin  gaps  between  the  quartz  shelf  and  the 
ring,  and  between  the  ring  and  susceptor,  allow  gas  to  flow  between  upper  and 
lower chamber sections. In addition, diffusion of species from upper to lower and
vice versa can occur due to concentration and thermal gradients. For this reason,
purge  gases  are  pumped  into  the  lower  region  through  the  susceptor  rotation  shaft 
and  a  purge  inlet  in  the  front  wall.  The  purge  flow  prevents  the  process  gases 
from  escaping  to  the  lower  section,  which  can  result  in  unwanted  deposition  on  the 
back-side  of  the  susceptor. 

The wafer and chamber are heated by upper and lower arrays of linear
tungsten-halogen lamps, and four spot lamps directed at the center of the
susceptor (see Section 6.3 for details and analysis). The upper and lower lamp
arrays illuminate, respectively, the top surface of the wafer and the bottom of
the susceptor. Heat radiation is intensified by gold coated reflectors surrounding
the process chamber on all sides. Four thermocouples measure the temperature at
the center, front, rear, and side of the susceptor. We note that while the center
thermocouple is located in contact with the center of the susceptor, the other
three thermocouples are located at the front, rear, and side of the ring that
surrounds the susceptor. Thus, susceptor temperature is measured only
approximately at points other than the center.

The  quartz  chamber  and  lamp-house  are  cooled  by  air  flow.  All  components 
are  contained  in  a  stainless  steel  enclosure. 

Product  and  Consumables 

NG-ESSS uses the Epsilon-1 reactor to deposit thin films of epitaxial Si-Ge,
epi-Si, and poly-Si. Epitaxial growth, or epitaxy, (see, e.g., [134, 162]) refers
to the deposition of a thin layer of material onto the surface of a single-crystal
substrate in such a manner that the layer is also single-crystal and has a fixed
and predetermined crystallographic orientation with respect to the substrate.
Epitaxial layers are deposited on silicon wafers that are either bare or covered
with a patterned layer of silicon dioxide (SiO2). Poly-Si is deposited on a layer
of silicon dioxide.

The  major  source  gases  used  to  deposit  epi-Si  layers  commercially  are  (see, 
e.g.,  [134,  156]) 

•  silane (SiH4) at low temperatures (< 1000 °C); and

•  silicon  tetrachloride  (SiCl4),  dichlorosilane  (SiH2Cl2),  and  trichlorosilane 
(SiHCl3)  at  higher  temperatures. 

This  research  deals  with  low  temperature  growth,  for  which  NG-ESSS  uses  silane 
as  the  source  gas.  Germane  (GeH4)  is  added  to  the  mixture  for  growth  of  Si-Ge. 
The  carrier  gas  is  either  hydrogen  (H2)  or  nitrogen  (N2).  Dopant  precursors  such 
as arsine (AsH3) can also be added to the mixture.

There is a direct relationship between growth of poly-Si and epi-Si. The
processing steps are identical, and growth rates are virtually identical [128].
Experiments are performed at NG-ESSS by depositing poly-Si rather than epi-Si,
because they have tools for measuring poly-Si films (e.g., nanospec, ellipsometer)
but not for epi-Si (e.g., SIMS). (It is the presence of the oxide boundary that
allows for easier measurement of poly-Si.) Thus, poly-Si experiments allow for
rapid process evaluation. For this reason, NG-ESSS sometimes performs epi-Si and
poly-Si experiments in parallel, with measurements taken for the poly-Si films. In
light of this, we performed experiments and simulations for growth of poly-Si. The
process gases we used were a mixture of 2% SiH4 diluted in H2 as the silicon
source, together with 20 slm H2 as the carrier, except as noted otherwise.

Specifications for the final product (thin film) include characteristics such as
chemical composition, film thickness, dopant concentration, crystal structure,
resistivity, and possibly other factors. Both aggregate and spatially distributed
quantities are important. Usually, spatial uniformity is specified by a variation
tolerance across the wafer surface, e.g., 5% allowable non-uniformity.
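
As an illustration of how such a tolerance might be checked against measured data, the sketch below computes a half-range non-uniformity, (max - min) / (2 * mean), from a set of thickness measurements. The choice of metric is an assumption made for this example; the text does not state which definition of thickness variation is used at NG-ESSS, and the function names are hypothetical.

```python
def thickness_nonuniformity(thicknesses):
    """Return (max - min) / (2 * mean) as a fraction of the mean thickness.

    This half-range definition is assumed for illustration only.
    """
    values = [float(t) for t in thicknesses]
    if not values:
        raise ValueError("at least one thickness measurement is required")
    mean = sum(values) / len(values)
    if mean <= 0.0:
        raise ValueError("mean thickness must be positive")
    return (max(values) - min(values)) / (2.0 * mean)


def meets_tolerance(thicknesses, tolerance=0.05):
    """True if the non-uniformity does not exceed the allowed fraction."""
    return thickness_nonuniformity(thicknesses) <= tolerance


# Hypothetical measurements (arbitrary units) at several wafer locations.
measurements = [98.0, 100.0, 101.0, 99.5, 101.5]
print(thickness_nonuniformity(measurements), meets_tolerance(measurements))
```

Other definitions, such as the standard deviation divided by the mean, would give different numbers for the same data, so a tolerance is meaningful only together with the metric it refers to.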

In general, aggregate characteristics such as average growth rate are determined
by process conditions for temperature, pressure, and flow rates as set by the user
in process recipes. The spatial distributions of the various properties, and hence
uniformity, are mainly controlled by equipment settings such as thermocouple
offsets and injector opening sizes. These two methods of equipment and process
control are described below.

Process Conditions and Recipes


In order to achieve the desired aggregate characteristics, the process engineer
designs step-by-step recipes. Each step performs a particular task such as etch,
bake, purge, or deposit, for a specified amount of time. We consider only the
deposition steps here. The process engineer specifies, for each deposition step,
the choice of source, carrier, and purge gases, set-points for temperature,
pressure, and flow rates, and time duration. We refer to these specifications as
recipe inputs. They are programmed into the Epsilon-1 microprocessor and
controlled automatically in-situ. For example, PID controllers and mass flow
controllers (MFCs) regulate the thermocouple temperatures and inlet flow rates
around their respective set-points.

Si-Ge films are deposited at a temperature of 675 °C. This falls within the
low temperature regime, which is roughly 600-800 °C. At low temperature, surface
reactions are thermally activated and controlled by deposition kinetics. NG-ESSS
also deposits some films in the high temperature regime, which is roughly
900-1100 °C. At high temperature, surface reactions are mass transport controlled.
However, temperature regulation is still important, as it determines layer
resistivity, and large temperature gradients can cause slip, i.e., mechanical
damage to the wafer. All growth data in this study is restricted to the low
temperature regime.

The Epsilon-1 reactor is capable of growth at atmospheric pressure (AP) and
reduced pressure (RP), which is roughly 10-100 Torr. For this research, we
performed deposition in the RP regime at 20 Torr and 40 Torr. The flow rate for
each individual process and purge gas used is specified in standard liters per
minute (slm) or standard cubic centimeters per minute (sccm). The process gases,
e.g., hydrogen carrier and silane source, are mixed prior to injection into the
chamber. The purge flow rate is set to prevent mixing between upper and lower
chamber sections, and is generally the same for each recipe. Since its impact is
on equipment integrity rather than product characteristics, we treat it separately
from the other recipe inputs.

Equipment  Settings 

Reactor  operation  can  be  adjusted  ex-situ  via  several  mechanisms  that  are  included 
in  certain  components  of  the  reactor.  The  process  engineer  can  set  the  size  of 
gas  injector  openings  in  the  inlet  flange,  the  relative  power  setting  for  each  lamp 
group,  PID  feedback  gains,  susceptor  rotation  rate,  and  thermocouple  offsets.  We 
refer  to  these  as  equipment  settings.  In  contrast  to  recipe  inputs,  the  equipment 
settings  are  semi-permanent,  i.e.,  they  are  not  changed,  in  general,  for  each  different 
process recipe. Rather, once an equipment setting is adjusted so that the reactor
yields acceptable films, it remains fixed from run to run until process drift or tube
replacement necessitates an adjustment. The equipment settings play a key role in
achieving spatial uniformity of deposition thickness in the Epsilon-1. We elaborate
on some of these settings here.

As  stated  earlier,  wafer  temperature  is  set  as  a  recipe  input.  However,  this 
one  setting  does  not  provide  the  capability  to  adjust  the  temperature  distribution 
across  the  wafer  surface.  The  necessary  additional  degrees  of  freedom  are  provided 
by  the  thermocouple  offsets.  There  are  three  offsets,  one  each  for  thermocouples 
at  the  front,  rear,  and  side  of  the  susceptor.  The  center  thermocouple  has  no 
offset,  and  its  temperature  is  regulated  about  the  recipe  temperature  set-point. 
The  temperatures  of  the  other  thermocouples  are  regulated  about  the  sum  of  the 
temperature set-point and the corresponding offset. For example, suppose the
temperature set-point is given in the recipe as 700 C, and the front offset is given
as -20 C. Then the center temperature is regulated about 700 C and the front
temperature is regulated about 680 C. Use of the offsets has the effect of creating
four  separate  temperature  set-points.  However,  even  with  the  additional  degrees  of 
freedom,  the  authority  to  control  the  entire  susceptor  temperature  profile  is  limited. 
The  profile  can  be  set  only  roughly  at  points  other  than  at  the  thermocouple 
locations.  Offsets  are  currently  set  via  trial-and-error  and  heuristic  methods. 
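
The offset arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name thermocouple_setpoints and the dictionary layout are assumptions made here and are not part of the Epsilon-1 control software. The rear and side offsets in the example are set to zero purely for illustration.

```python
def thermocouple_setpoints(recipe_setpoint_c, offsets_c):
    """Return the regulation set-point (deg C) for each thermocouple.

    The center thermocouple has no offset; every other thermocouple is
    regulated about the recipe set-point plus its own offset.
    """
    setpoints = {"center": recipe_setpoint_c}
    for location, offset in offsets_c.items():
        setpoints[location] = recipe_setpoint_c + offset
    return setpoints


# Example from the text: recipe set-point 700 C, front offset -20 C.
# Rear and side offsets are hypothetical placeholders (zero).
example = thermocouple_setpoints(700.0, {"front": -20.0, "rear": 0.0, "side": 0.0})
print(example)
```

The result contains four separate set-points, one per thermocouple, which is the sense in which the offsets add degrees of freedom to the single recipe temperature.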

In  the  inlet  flange  currently  installed  in  the  Epsilon-1  at  NG-ESSS,  there  is  a 
set  of  three  gas  injector  slits  with  adjustable  widths.  These  are  used  for  adjusting 
the flow profile at the inlet to the process chamber. We note that the manual
adjustment of slit widths is difficult, and the widths can be measured only
approximately. In the future, this equipment will be replaced by a set of five injector
port orifices with adjustable diameters. The new gas supply equipment will allow
for tighter control and more degrees of freedom in determining the inlet flow
profile. Either way, however, the authority to control the flow characteristics is once
again limited. The manner in which the sizes of the gas injector openings affect the
flow profile is known only roughly. The sizes of the gas injector openings are currently
determined via trial-and-error and heuristic methods.

Wafer rotation is used to smooth non-uniform heating and other effects. It is
set at 35 rpm for most, if not all, production runs.

Operating  Structure 

An  overview  of  the  general  operating  structure  of  the  Epsilon-1  reactor,  from  the 
viewpoint  of  how  recipe  inputs  and  equipment  settings  affect  reactor  operation, 
is presented in Figure 5.4. Note that the internal details of individual blocks are
not included here. They will be discussed whenever relevant later in this report.
Of particular interest is the process chamber block, which contains physical and
chemical mechanisms for film growth.


Figure  5.4:  Overview  of  general  operating  structure  of  Epsilon-1  reactor,  from  the 
viewpoint  of  how  recipe  inputs  and  equipment  settings  affect  reactor  operation. 
Numbers  in  parentheses  refer  to  the  number  of  distinct  signals  in  the  associated 
path. 


5.2.3  Uniformity  Case  Study 

An essential purpose of the process-equipment model is to predict the steady state
deposition rate with emphasis on the spatial distribution of film thickness. It
is crucial, then, to identify those phenomena that are important to determining
growth rate, and include the effects of those phenomena in the model. It is also
necessary to identify and include relevant features of the reactor geometry and
operation, and to incorporate sufficient spatial resolution and dimensionality.

For thermally activated thin film growth, it is usually assumed or implied in the
literature that achieving deposition uniformity is tantamount to achieving temperature
uniformity across the wafer surface. Optimization and control strategies are
then designed to achieve the temperature uniformity objective via manipulation
of lamp power settings, so that deposition uniformity is achieved via automatic
lamp control. Examples of such studies are found in [28, 30, 63, 139, 151, 152].
Sometimes the assumption is justified by stating that wafer rotation will average
out all other factors.

However, the assumption of equivalence between temperature uniformity and
deposition uniformity does not necessarily hold, even for thermally driven processes
and processes in which the wafer is rotating. For example, consider the experience
of NG-ESSS with deposition of epi-Si or Si-Ge in the thermally driven regime
(approximately 600-800 degrees C) in the Epsilon-1 reactor. The process engineer
achieves thickness variations of less than 1.5% (considered acceptable uniformity)
for a non-rotating wafer by setting thermocouple offsets at -25 C, -60 C, and -35 C for
front, rear, and side, respectively [128]. These values were determined via trial-and-error
growth experiments. If growth rate were affected only by temperature and no
other factors, then the 1.5% thickness variation that is achieved with those offsets
would correspond to a 0.075% temperature variation across the wafer surface, or
roughly 0.5 degrees C. However, for a recipe temperature set-point of 700 C, the
corresponding thermocouple set-points are center at 700 C, front at 675 C, rear
at 640 C, and side at 665 C, for a maximum deviation of 8.5%, as illustrated
in Figure 5.5. Thus, the non-uniformity imposed by the offsets appears to be
significantly greater than that indicated by actual growth rates.
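
The 0.075% figure can be reproduced under the assumption that growth rate follows an Arrhenius law in temperature alone, so that, to first order, a relative temperature change dT/T produces a relative rate change of (E_a/(R_g T)) dT/T. The sketch below is a plausible reading of that step, not a statement of how the number was actually obtained. It assumes E_a/R_g of about 1.94 x 10^4 K (of the order of the fitted values reported in Table 5.2) and a wafer temperature of 700 C; the function names are hypothetical.

```python
def arrhenius_sensitivity(ea_over_rg_k, temp_k):
    """Dimensionless factor (dR/R) / (dT/T) for R = k0 * exp(-Ea / (Rg * T))."""
    return ea_over_rg_k / temp_k


def temperature_variation_for_rate_variation(rate_variation, ea_over_rg_k, temp_k):
    """First-order relative and absolute (kelvin) temperature variation
    implied by a given relative growth-rate variation."""
    relative = rate_variation / arrhenius_sensitivity(ea_over_rg_k, temp_k)
    return relative, relative * temp_k


EA_OVER_RG_K = 1.94e4            # assumed value, in kelvin
WAFER_TEMP_K = 700.0 + 273.15    # 700 C expressed in kelvin

rel_dt, abs_dt = temperature_variation_for_rate_variation(
    0.015, EA_OVER_RG_K, WAFER_TEMP_K
)
print(f"relative temperature variation: {100 * rel_dt:.3f} %")
print(f"absolute temperature variation: {abs_dt:.2f} K")
```

Under these assumptions the sensitivity factor is about 20, the relative variation comes out near 0.075%, and the absolute variation is well under one degree, the same order as the 0.5 degrees C quoted in the text.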

Figure 5.5: An example of how thermocouple offsets (front -25 C, rear -60 C, side
-35 C) influence the temperature set-points around which the four thermocouples
are regulated by PID controllers.

The set-up of the reactor apparatus may partially explain the smaller variation
in actual growth rates. Recall that the front, rear, and side thermocouples are
located  in  the  ring  surrounding  the  susceptor,  rather  than  in  the  susceptor  itself. 
Furthermore,  the  150  mm  (6  in)  diameter  wafer  is  resting  on  quartz  pins  at  the 
center  of  the  225  mm  (8.85  in)  diameter  susceptor.  Therefore,  it  is  reasonable  to 
assume  that  the  temperature  variation  across  the  wafer  surface  will  be  less  than 
the  variation  across  the  entire  susceptor.  However,  by  similar  reasoning,  it  is  also 
intuitive  that  this  could  not  entirely  account  for  the  thickness  uniformity.  For 
example,  the  front,  rear,  and  side  thermocouples  are  all  the  same  distance  from 
the  center.  But  the  offsets  are  not  equal.  The  apparatus  symmetry  is  not  mirrored 
by  the  temperature  set-points.  Some  other  phenomena  must  be  playing  a  role. 

A  quantifiable  relationship  between  the  offsets  and  the  actual  temperature  held 
on  the  wafer  surface  is  unknown.  This  is  because  there  exists  no  reliable  method  for 
measuring  temperature  on  the  wafer  surface.  Poly-silicon  growth  rates  are  often 
used  as  a  sensitive  thermometer.  While  this  method  may  be  useful  for  measuring 
aggregate temperature, we argue here that it is flawed for measuring a temperature
distribution across a surface, unless one can guarantee that other conditions
across the surface (e.g., reactant concentrations) are perfectly uniform. The other
alternative is to use an instrumented wafer, i.e., a wafer with attached thermocouples.
However, ASM America and NG-ESSS consider measurements taken using
the instrumented wafer to be unreliable, especially while operating at process
conditions for growth [106, 128]. Nevertheless, we report that experiments using an
instrumented  wafer  indicate  maximum  temperature  variations  of  5  degrees  C  or 
0.7%  [128].  This  non-uniformity  is  less  than  that  predicted  by  considering  just  the 
offsets  and  more  than  that  found  by  measuring  growth  rates.  In  that  respect,  it 
appears  to  fall  in  the  correct  range. 

Finally,  we  note  that  wafer  rotation  reduces  the  growth  rate  variation  from 
1.5%  to  less  than  1%  and  temperature  variations  as  recorded  by  the  instrumented 
wafer  from  5  C  to  1  C.  Therefore,  we  may  conclude  that  wafer  rotation  does  have 
the intended flattening effect, but does not compensate entirely for temperature
non-uniformity.

We  have  demonstrated  anecdotally  that  thickness  and  growth  rate  uniformity 
in  the  thermally  activated  regime  is  achieved  by  setting  three  thermocouple  offsets 
so  that  the  temperature  distribution  across  the  susceptor  is  intentionally  non- 
uniform.  The  actual  relationships  among  offsets,  temperature,  and  growth  rate, 
and  the  other  factors  that  affect  them,  are  left  to  be  determined. 


5.3  Growth  Experiments 

In  this  section  we  describe  silicon  growth  experiments  performed  to  study  the 
relationship  between  deposition  rate  and  operating  conditions  such  as  temperature 
and  flow  rates.  The  experimental  data  is  used  later  as  a  basis  of  comparison  to 
validate  the  modeling  results. 

The experiments were conducted by the author and Mr. Paul Brabant of NG-ESSS
using the Epsilon-1 reactor on site at NG-ESSS. In each experiment, we
deposited poly-Si on a silicon wafer coated with a layer of silicon-dioxide. These
experiments are suitable for determining growth rates of both poly-Si and epi-Si,
because their growth rates are identical (see Section 5.2.2). The
process gases used were a mixture of 2% SiH4 diluted in H2 as the silicon source,
together with 20 slm H2 as the carrier. Measurements of film thickness were taken
using the available nanospec (Nanometrics 210 XP Scanning UV Nanospec/DUV
Microspectrophotometer).

The main objective was to determine the relationship between growth rate
and operating conditions such as temperature and flow rates, including finding the
unknown parameters (e.g., activation energy) of an assumed Arrhenius relationship
between wafer temperature and deposition rate. The relationships were studied
under  a  range  of  typical  operating  conditions.  The  established  relationship  is 
used  later  for  validation  of  process-equipment  simulations  (Section  5.5)  and  lamp 
heating  models  (Section  6.3). 

Experimental  Procedure 

Thin films of polycrystalline silicon were deposited from the silane precursor over
a five minute period at a pressure of 20 Torr. Deposition was performed under a
combination of operating conditions consisting of four different wafer temperatures
and three different silane flow rates (and hence three different silane mole fractions).

Temperatures  were  set  in  the  surface-reaction  controlled  regime  so  that  growth 
would  be  thermally  activated.  This  regime  is  roughly  from  600  C  to  800  C  for 
deposition  of  silicon  from  silane  gas.  We  chose  the  following  wafer  temperatures 
at  which  to  deposit  silicon:  650  C,  700  C,  725  C,  and  750  C. 

Three different flow rates were used for the 2% silane in hydrogen precursor:
1.5 slm, 2.5 slm, and 3.5 slm. Considering the 2% dilution, these three flow rates
correspond to 30 sccm, 50 sccm, and 70 sccm of silane, respectively. The
silane-hydrogen precursor was further diluted in 20 slm of the carrier hydrogen (H2) gas.
Thus, the three flow rates correspond to three mole fractions, $1.4 \times 10^{-3}$,
$2.2 \times 10^{-3}$, and $3.0 \times 10^{-3}$, respectively.
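
The conversion from precursor flow rate to silane flow rate and mole fraction can be checked with a short calculation. This is a minimal sketch, assuming that standard volumetric flow rates are proportional to molar flow rates (ideal gas behavior) and that the purge gas, which is kept separate from the process gases, does not enter the mixture; the function name is hypothetical.

```python
def silane_flow_and_mole_fraction(mix_flow_slm, carrier_flow_slm=20.0,
                                  silane_fraction=0.02):
    """Return (silane flow in sccm, silane mole fraction) for a given flow
    of the silane/hydrogen precursor mixture diluted in carrier hydrogen."""
    silane_sccm = mix_flow_slm * 1000.0 * silane_fraction
    total_sccm = (mix_flow_slm + carrier_flow_slm) * 1000.0
    return silane_sccm, silane_sccm / total_sccm


results = {mix: silane_flow_and_mole_fraction(mix) for mix in (1.5, 2.5, 3.5)}
for mix, (sccm, mole_fraction) in results.items():
    print(f"{mix} slm mixture: {sccm:.0f} sccm silane, "
          f"mole fraction {mole_fraction:.2e}")
```

Rounded to two significant figures, the computed mole fractions agree with the values quoted above.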

The reactor was operated in its usual, automatic mode (i.e., using PID control
loops for temperature regulation and wafer rotation for uniformity), using
pre-programmed recipes. Recipes were programmed to set chamber pressure at 20 Torr
and to deposit silicon from silane precursor for five minutes onto the bare silicon
wafers. Film thicknesses were measured later using the nanospec.

Experimental  Results 

We attempted twelve deposition experiments, one for each combination of the
four wafer temperatures (650, 700, 725, 750 C) and three silane flow rates (30, 50,
70 sccm). Each of the twelve depositions was performed on a different wafer. At
650 C, there was no appreciable deposition for any of the flow rates. Thus, these
three wafers provided no data for analysis. At 700 C and above, enough silicon was
deposited so that measurements could be taken. Thickness was measured using
the nanospec at five different points on the wafer surface (see [117] for the raw
data). In the case where temperature was 700 C and silane flow rate was 30 sccm,
deposited film thickness was less than 100 Angstroms over the five minute period,
the minimum readable by the nanospec. Hence, growth rate was recorded as less
than 20 Å/min. The data is presented in Table 5.1.

                 Poly-silicon Deposition Rate
        As Function Of Temperature and Silane Flow Rate

Process Conditions
    Chamber Pressure      20 Torr
    Carrier Gas           20 slm H2
    Source Gas            2% SiH4 in H2
    Purge Gas             7 slm H2

Growth Rate (Å/min)

    Wafer                     Silane Flow Rate
    Temperature      30 sccm      50 sccm      70 sccm
    700 C            < 20.00        65.68        80.08
    725 C              73.12       106.60       138.72
    750 C             118.28       171.32       216.68

Table 5.1: Measured deposition rate (Angstroms per minute): Five minute deposition;
three wafer temperatures; three silane flow rates.

A model that is useful for describing deposition kinetics in the thermally activated
regime is the Arrhenius relationship

$$R_{\mathrm{Si}} = k_0 \exp\left(-\frac{E_a}{R_g T_w}\right) X_{\mathrm{SiH_4}} \tag{5.1}$$

where $R_{\mathrm{Si}}$ denotes deposition rate, $k_0$ denotes the pre-exponential constant, $E_a$
denotes the activation energy, $R_g$ denotes the gas constant, $T_w$ denotes the wafer
temperature, and $X_{\mathrm{SiH_4}}$ denotes the silane mole fraction. We call a plot of the
logarithm of deposition rate versus inverse temperature an Arrhenius plot. The
Arrhenius plots associated with the data we collected are shown in Figure 5.6.

According to equation (5.1), the slope of an Arrhenius plot, which equals $-E_a/R_g$,
gives the activation energy $E_a$, while the intercept (along with knowledge of the
silane mole fraction) gives the pre-exponential constant $k_0$. Computed parameters
are given in Table 5.2. The activation energies calculated from the Arrhenius plots
range from 1.57 eV to 1.69 eV depending on silane mole fraction. This range is
reasonably close to (roughly 7% to 14% below) the activation energy of 1.82 eV
determined experimentally by the manufacturer, ASM America [105]. In addition,
the pre-exponential constants range from $3.8 \times 10^{8}$ to $1.85 \times 10^{9}$, a range
which includes the value of $7.9 \times 10^{8}$ predicted by the manufacturer.

Arrhenius Plot for Poly-Si Deposition Rate

Figure 5.6: Arrhenius plots for silicon deposition from silane gas: each plot represents
log of deposition rate (microns per minute) versus inverse absolute temperature
for one of the three silane flow rates used.
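
The extraction of $E_a$ and $k_0$ from an Arrhenius plot can be sketched as an ordinary least-squares line through the logarithm of rate versus inverse absolute temperature. The sketch below is illustrative only: the function name fit_arrhenius is hypothetical, the fitting procedure actually used to produce Table 5.2 is not specified here, and the inputs are the averaged rates of Table 5.1 rather than the raw five-point data of [117]. The Boltzmann constant in eV/K is used so that the activation energy comes out directly in electron-volts.

```python
import math

K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K
ANGSTROM_TO_UM = 1.0e-4      # 1 Angstrom = 1e-4 micron


def fit_arrhenius(temps_c, rates, mole_fraction):
    """Least-squares line through ln(rate) versus 1/T, with T in kelvin.

    Assumes rate = k0 * exp(-Ea / (kB * T)) * mole_fraction and returns
    (Ea in eV, k0 in the units of rate).
    """
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals -Ea / kB, in kelvin
    intercept = y_mean - slope * x_mean    # equals ln(k0 * mole_fraction)
    ea_ev = -slope * K_B_EV_PER_K
    k0 = math.exp(intercept) / mole_fraction
    return ea_ev, k0


# 30 sccm column of Table 5.1; the 700 C entry is below the detection
# limit of the nanospec and is therefore omitted.
temps_c = [725.0, 750.0]
rates_um_per_min = [73.12 * ANGSTROM_TO_UM, 118.28 * ANGSTROM_TO_UM]
ea_ev, k0 = fit_arrhenius(temps_c, rates_um_per_min, mole_fraction=1.4e-3)
print(f"Ea = {ea_ev:.2f} eV, k0 = {k0:.2e} um/min")
```

For the 30 sccm column this procedure gives values close to the corresponding entries of Table 5.2. Applying it to the other two columns of Table 5.1 may give values that differ somewhat from those in Table 5.2.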


5.4  Process-Equipment  Model 

This section motivates and describes the process-equipment model that we developed
to predict process behavior (transient and steady-state) and product characteristics.
We loosely describe the model as comprehensive because it accounts
for a wide range of physical and chemical mechanisms, reactor geometry, material
properties, and the effects of process conditions (pressure, temperature, flow
rates, and gas composition) and equipment settings (injector sizes, thermocouple
offsets). This does not imply that the model represents a complete description
of process-equipment dynamics (if such a model is actually possible). Rather, we
made choices so that the importance of a particular effect would be reflected in
the model fidelity.

    Parameters For Arrhenius Relationship Describing Silicon Deposition Kinetics

Assumed Relationship:  $R_{\mathrm{Si}} = k_0 \exp\left(-\frac{E_a}{R_g T_w}\right) X_{\mathrm{SiH_4}}$

Symbol                  Description                                          Data
$F_{\mathrm{mix}}$      Silane/Hydrogen Mixture Flow Rate (slm)              1.5     2.5     3.5
$F_{\mathrm{SiH_4}}$    Silane Flow Rate (sccm)                              30      50      70
$X_{\mathrm{SiH_4}}$    Silane Mole Fraction ($\times 10^{-3}$)              1.4     2.2     3.0
$E_a$                   Activation Energy (eV)                               1.69    1.67    1.57
$E_a$                   Activation Energy (J/mol) ($\times 10^{5}$)          1.63    1.61    1.51
$E_a/R_g$               Ratio (K) ($\times 10^{4}$)                          1.96    1.94    1.82
$k_0$                   Pre-exponential Constant (μm/min) ($\times 10^{9}$)  1.85    1.30    0.38
$k_0$                   Pre-exponential Constant (cm/sec) ($\times 10^{3}$)  3.08    2.16    6.54

Table 5.2: Parameters calculated by fitting experimental data to an assumed Arrhenius
relationship for poly-Si growth rate as a function of temperature.
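
The unit conversions relating the rows of Table 5.2 can be checked with a short calculation. This is a minimal sketch using standard values of the physical constants; the function names are hypothetical, and only the first data column (the 1.5 slm mixture flow rate) is used as an example.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19     # joules per electron-volt
AVOGADRO_PER_MOL = 6.02214076e23          # particles per mole
GAS_CONSTANT_J_PER_MOL_K = 8.314462618    # universal gas constant


def ev_to_j_per_mol(energy_ev):
    """Convert an energy per particle in eV to an energy per mole in J/mol."""
    return energy_ev * ELEMENTARY_CHARGE_C * AVOGADRO_PER_MOL


def um_per_min_to_cm_per_s(rate_um_per_min):
    """Convert a rate in microns per minute to centimeters per second."""
    return rate_um_per_min * 1.0e-4 / 60.0


ea_j_per_mol = ev_to_j_per_mol(1.69)
ea_over_rg_k = ea_j_per_mol / GAS_CONSTANT_J_PER_MOL_K
k0_cm_per_s = um_per_min_to_cm_per_s(1.85e9)
print(f"Ea = {ea_j_per_mol:.3e} J/mol")
print(f"Ea/Rg = {ea_over_rg_k:.3e} K")
print(f"k0 = {k0_cm_per_s:.3e} cm/s")
```

To three significant figures these results reproduce the first-column entries of the J/mol, K, and cm/sec rows.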


5.4.1  Modeling  Approach 

Models for silicon growth that cannot be coupled to gas phase transport phenomena
and that use a simplified chemical kinetics model are inadequate for describing
the essential physics and chemistry. For example, initial models for silicon
growth in the Epsilon-1 presented in [117] considered the process-equipment state
to be completely determined by a 1-dimensional (radial) wafer temperature profile.
Growth rate was related to wafer temperature by a single nonlinear Arrhenius
law. As stated earlier, this approach appears often in the literature (motivated by
temperature control problems), but is inadequate for our purposes here.

Models incorporating more complete descriptions of transport phenomena, chemical
mechanisms, couplings, 2-dimensional or 3-dimensional spatial effects, and
non-symmetric geometries have been appearing recently in the literature. Authors
have approached the modeling problem based on their specific objectives, process,
and equipment, resulting in models with varying levels of detail and breadth of
scope.

In [98] a dynamic simulator is presented which predicts the time-dependent
behavior of equipment, process, sensors, and control systems for RTCVD of poly-Si
from silane. This simulator is comprehensive in the respect that it provides
the capability to predict aggregate values for deposition rate, film thickness,
temperature, and gas flow, as well as cycle time, consumables volume, and reactant
utilization. However, prediction of deposition uniformity requires high spatial
resolution, rather than aggregate quantities, so their approach is not suitable here.

Axisymmetric cylindrical vertically oriented reactors are considered in [24, 49].
They incorporate coupled effects of 2-dimensional gas flow, mass transport, and
heat transfer. In addition, [49] includes thermal diffusion and the effect of
susceptor rotation on the gas flow. These models consider a relatively broad scope
of effects for reactors with simple geometries that can be analyzed and simulated at
high resolution in two spatial dimensions. They do not include models of chemical
mechanisms for growth.

Models  for  reactors  with  non-symmetric  geometries  that  require  consideration 
of  3-dimensional  effects  are  scarce.  This  is  mainly  due  to  the  significant  additional 
complexity  of  equations,  boundary  conditions,  and  solution  techniques,  along  with 
burdensome computational demands. One strategy is to restrict the effort to one
particular effect of interest. Two such models for commercial RTCVD chambers
that are limited to heat transfer only, including complicated surface-to-surface
radiation, are presented in [78, 85].

Another strategy is to use commercially available general purpose computational
fluid dynamics (CFD) codes and software packages. These packages provide
the necessary tools for modeling of transport phenomena coupled with some
chemical mechanisms, including efficient numerical integration schemes and
3-dimensional grid generation for irregular geometries. The CFD approach provides
a comprehensive and general process-equipment state description.

However, there are drawbacks to using general purpose CFD packages. There is
an interface layer in the software that separates the user from the underlying
computer code and variables. This is advantageous for setting up problems but makes
it difficult to use CFD code in control loops or other specialized applications. The
general purpose nature of the software results in some built-in limitations to the
level of accuracy and detail that can be achieved in modeling specific aspects of a
particular piece of equipment. It is unclear how to channel computational resources
to areas in accordance with their importance, or to deal efficiently with phenomena
that occur at vastly different spatial and temporal scales. CVD applications
present special challenges, including modeling for transport of mass and momentum
in a multicomponent gas mixture, heat radiation with spectral dependence,
and surface chemistry.

Some of the above problems related to CVD applications were addressed by
the ESPRIT ACCESS-CVD project funded by the European Commission to develop
and implement a CFD code specifically designed for use in modeling CVD
processes. The project resulted in a commercial code, PHOENICS-CVD, which
makes it practical to include many of the important effects associated with CVD
processes. It consists of coupled dynamic sub-models for fluid flow, heat transfer,
and multicomponent species transport in the gas phase, integrated with a
model for conjugate heat transfer among lamps and other solid surfaces, databases
and models for gas phase and surface chemistry for a large number of reactions,
and databases and models for determining the time-varying, parameter dependent
transport, thermodynamic, and optical properties of the involved materials.

The PHOENICS-CVD software was used to model a variety of CVD reactors in
a semiconductor development line for 0.3 μm CMOS devices as presented in [163].
Most importantly for our purposes, the authors demonstrated the capability of
PHOENICS-CVD as a tool for investigating uniformity issues in reactors with
non-symmetric geometries.

Given manufacturing objectives, and in light of the complicated geometry and
operation of the Epsilon-1 reactor, we implemented Epsilon-1 reactor simulations
using PHOENICS-CVD. Figure 5.7 shows a general overview of the modeling
framework. For a detailed exposition on the various aspects of this type of model
see [82]. The idea is to produce a model that predicts the behavior of the process
chamber block shown previously in Figure 5.4. Process recipe inputs and equipment
settings enter the model via material parameters, boundary conditions on
transport variables, and geometric construction of the solution grid. We note that
even using the powerful PHOENICS-CVD tool, developing high-fidelity models that
include most or all of the desired features previously described is an immensely
time-consuming undertaking. For this reason, various simplifications are still employed,
which are described in the sequel as they are encountered.

Figure 5.7: Overview of modeling framework. Process-equipment state components
(gas flow, heat transfer, species transport) are coupled to each other, material
properties, and chemical mechanisms.

5.4.2  Process-Equipment  State 

It has become apparent that spatial uniformity of deposition rate and film thickness
is influenced by several variables, not limited to wafer temperature, even for
thermally driven processes. These variables are included in what we refer to as
the process-equipment state, which is the time-varying spatial distribution of flow
velocity, temperature, and species concentrations throughout relevant portions of
the reactor. Table 5.3 lists the essential variables. The time evolution and steady-state
behavior of the process-equipment state are determined by the physical and chemical
mechanisms of the CVD process, reactor geometry, material properties, recipe
inputs, and equipment settings. The components of the process-equipment state
interact with each other, the recipe inputs, and the equipment settings in a complex
manner.

        Essential Variables Comprising Process-Equipment State

Symbol                Description

Gas Phase Transport Variables
$\mathbf{v}$          Velocity Vector of Gas Mixture
$P$                   Pressure
$T_G$                 Temperature of Gas Mixture
$\omega_i$            Mass Fraction of i-th Species in Gas Mixture
$\mathbf{j}_i$        Total Diffusive Mass Flux Vector of i-th Species in Gas Mixture

Gas Phase Thermal Properties
$\rho$                Density of Gas Mixture
$C_p$                 Specific Heat Capacity of Gas Mixture
$k_G$                 Thermal Conductivity of Gas Mixture

Conjugate Heat Transfer Variables
$T_w$                 Temperature of Wafer
$T_{\mathrm{wall}}$   Temperature of Chamber Wall

Thermal Properties of Solids
$\rho_w$              Density of Wafer
$C_{p_w}$             Specific Heat Capacity of Wafer
$k_w$                 Thermal Conductivity of Wafer

Table 5.3: Variables and material parameters comprising the process-equipment
state. Dependencies on space and time have been suppressed.

The process-equipment state is manifested in certain macroscopic phenomena
that we believe have a significant influence on deposition uniformity. This is mainly
because they contribute to non-uniformity of reactant concentration
profiles at or near the wafer surface. We wish to study these phenomena using the
process-equipment model. Here we describe some of the important effects that we
focused on in the modeling effort.


Reactant  Depletion 

As precursor gases flow across the wafer surface, reactants are deposited, causing
a gradual downstream reduction in their gas phase concentration. Thus, downstream
portions of the wafer may be subject to lower concentrations of impinging
reactants, and hence the growth rate may be lower there. The magnitude of the
depletion effect varies depending upon process conditions. The degree to which
wafer rotation compensates for reactant depletion is not accurately known.

Nonuniform  Gas  Heating  and  Gas  Phase  Reactions 

Based on experimental data, gas phase reactions appear to be important in CVD
processes, except for those under very low pressure (see [82] p. 134). For example,
at atmospheric pressure, growth rate of silicon from silane is strongly influenced
by dissociative deposition of intermediate species formed in the gas phase.
Furthermore, gas phase reaction rates can be strongly dependent on temperature.
Typically, the gases heat up as they pass over the susceptor and wafer in the process
chamber. This may cause a gradual downstream increase in gas phase reaction
rates. Thus, downstream portions of the wafer may be subject to higher concentrations
of impinging reactants. The overall effect depends on the gas composition
and process conditions.

Thermal Diffusion of Species


The gas species in an initially homogeneous gas mixture will separate under the
influence of a temperature gradient (see [82] p. 110). Large, heavy molecules
(e.g., silane) diffuse toward colder regions, whereas small, light molecules (e.g.,
hydrogen) diffuse toward hotter regions. Usually, the effect is small compared
with ordinary concentration driven diffusion. However, due to the large thermal
gradients in the cold-wall Epsilon-1 (e.g., 300 C difference between wafer and walls),
thermal diffusion may have a significant effect. Thus, reactant concentration may
be higher where the gas is cooler, e.g., upstream or in the lower section of the
chamber. Reduction in growth rate by 20% to 30% caused by thermal diffusion
has been observed in RTCVD chambers (see [82] p. 164). Thermal diffusion is
sometimes referred to as the Soret effect.

Flow  Patterns 

Calculations in [117] indicate a Reynolds number of approximately 27 for gas flow
in the Epsilon-1. Thus, the flow is laminar, except possibly in and very close to the
injector nozzles. Nevertheless, the flow may have some interesting characteristics
that have an impact on deposition uniformity. Buoyancy-driven recirculation cells
are believed to occur in virtually all rapid thermal CVD (RTCVD) chambers
because of the large thermal gradients present (see [23], p. 339). Furthermore,
three types of natural convection rolls are typically observed in horizontal CVD
chambers: steady longitudinal, unsteady transversal, and steady transversal at the
leading edge of the heated susceptor (see [82], p. 162).
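
The laminar-flow conclusion rests on the magnitude of the Reynolds number, Re = rho v L / mu. The following minimal sketch shows the calculation. The property values are hypothetical placeholders and the transition threshold of roughly 2000 is an assumed order-of-magnitude figure; neither is taken from [117].

```python
def reynolds_number(density, velocity, length, viscosity):
    """Return Re = rho * v * L / mu (dimensionless) for SI inputs."""
    return density * velocity * length / viscosity


def is_laminar(reynolds, threshold=2000.0):
    """Crude regime check; the threshold is an assumed order-of-magnitude value."""
    return reynolds < threshold


# Hypothetical placeholder values, chosen only for illustration.
example_re = reynolds_number(density=0.1, velocity=0.5, length=0.02, viscosity=4.0e-5)
print(example_re, is_laminar(example_re))
```

A Reynolds number in the tens, as reported for the Epsilon-1, lies far below any plausible transition threshold, which is why the precise threshold value is not critical here.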

Remark 5.4.1 For each of the above effects, its relationship to the process-equipment
state, recipe inputs, equipment settings, and thickness uniformity is not
well understood. Moreover, it is not well understood how to compensate for
non-uniformity in species concentrations caused by these effects. The setting of gas
injector opening sizes and thermocouple offsets to minimize these effects and to
produce uniform thickness is done iteratively, usually requiring approximately five
test recipes. This modeling effort is a first step toward understanding these relationships
and developing a model-based systematic compensation method. □

5.4.3  Reactor  Geometry  and  Finite  Volume  Mesh 

The non-symmetric geometry of the Epsilon-1 necessitates genuine 3-dimensional
modeling of transport phenomena in the process chamber. We adopt a Cartesian
(x-y-z) coordinate system, since the lenticular chamber can be modeled roughly
as a long, thin box with polygonal or curved sides.

We  refer  to  the  direction  of  flow  from  front  to  rear  as  the  z  direction,  the 
bottom  to  top  direction  (perpendicular  to  the  wafer)  as  the  y  direction,  and  the 
left  side  to  right  side  direction  (looking  in  through  front)  as  the  x  direction.  These 
coordinates  are  natural  and  convenient  for  chamber  modeling  but  not  for  modeling 
the  cylindrical  wafer  and  susceptor,  whose  geometries  must  then  be  approximated. 

PHOENICS uses a finite-volume mesh as the discretization of the spatial domain.
Figure 5.8 shows a view of the overall mesh we developed for modeling the
Epsilon-1  process  chamber.  The  mesh  is  body-fitted  and  has  dimensions  of  25  by 
27  by  52  volume  elements  in  the  x,  y,  and  z  directions,  respectively. 
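
For orientation, the mesh dimensions quoted above fix the total number of finite volumes. The sketch below computes that count and shows one possible mapping from a cell position (i, j, k) to a single storage index. The x-fastest ordering and the function names are assumptions made for illustration; they are not claimed to match the PHOENICS storage convention.

```python
NX, NY, NZ = 25, 27, 52  # volume elements in the x, y, and z directions


def cell_count(nx=NX, ny=NY, nz=NZ):
    """Total number of volume elements in a structured nx-by-ny-by-nz mesh."""
    return nx * ny * nz


def flat_index(i, j, k, nx=NX, ny=NY):
    """Map zero-based (i, j, k) to one index, with x varying fastest (assumed)."""
    return i + nx * (j + ny * k)


print(cell_count())
print(flat_index(NX - 1, NY - 1, NZ - 1))
```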

The Epsilon-1 apparatus set-up and gas flows are not symmetric in the y- and
z-directions. Although the exterior geometry appears to be y-symmetric, there are
significant differences between upper and lower chamber sections. The chamber
does have x-symmetry, with the center y-z plane serving as a symmetry plane
about which the geometry and values for all variables are mirrored exactly. The
shaded portion on the right side of Figure 5.8 is not part of the actual mesh, but
rather a viewing aid to show a portion of the overall chamber geometry. Only the
left half of the chamber is modeled.

Figure 5.8: Overall body-fitted 25 x 27 x 52 finite volume mesh for modeling the
Epsilon-1 lenticular chamber. Solid cut-away figure at right is a viewing aid to
show the full geometry of the chamber, but not part of the mesh. Inlet side faces
viewer.

Figure 5.9 shows a top view of the x-z mid-plane level with the wafer surface.
The surface geometries of the wafer, susceptor, and ring have been approximated by
rectangular sections. It is possible to approximate the curved surfaces more accurately,
either with additional rectangular volume elements arranged appropriately,
or with irregularly shaped volume elements. However, irregularly shaped volume
elements caused computational difficulties, and construction of the disk shape from
regular elements required a large number of additional mesh elements in areas that
were not of particular interest. These drawbacks offset any advantages gained from
improving the geometrical accuracy of the wafer.
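
The trade-off between geometrical accuracy and element count can be illustrated with a simple experiment: approximate a disk by square cells and compare the covered area with the exact value. The sketch below assumes a uniform Cartesian grid and a rule that a cell belongs to the disk when its center lies inside the circle. Both are illustrative assumptions and do not describe the body-fitted mesh actually used.

```python
import math


def staircase_disk_area(radius, cell_size):
    """Area covered by square cells whose centers lie inside a disk at the origin."""
    n = int(math.ceil(radius / cell_size))
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            xc = (i + 0.5) * cell_size  # cell-center coordinates
            yc = (j + 0.5) * cell_size
            if xc * xc + yc * yc <= radius * radius:
                count += 1
    return count * cell_size ** 2


def relative_area_error(radius, cell_size):
    """Relative difference between the staircase area and pi * r^2."""
    exact = math.pi * radius ** 2
    return abs(staircase_disk_area(radius, cell_size) - exact) / exact


for h in (0.5, 0.1, 0.01):
    print(h, relative_area_error(1.0, h))
```

In this sketch a coarse grid with only twelve cells already covers the disk area to within a few percent, whereas a tight fit of the boundary requires orders of magnitude more cells.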

Figure  5.10  shows  a  side  view  of  the  center  y-z  plane.  The  upper  and  lower 
chamber  sections  can  be  identified,  respectively,  above  and  below  the  wafer.  Gases 
can  flow  and  species  can  diffuse  between  the  upper  and  lower  chamber  sections 
through  thin  gaps  between  the  quartz  shelf  and  the  ring,  and  between  the  ring  and 
susceptor.  We  model  only  the  shelf-ring  gap  since  it  is  significantly  wider  than  the 
ring-susceptor  gap,  and  assume  that  it  accounts  for  all  interaction  between  upper 
and  lower  sections. 


Figure 5.9: Top view of finite volume mesh at x-z mid-plane level with wafer
surface. (Labeled regions: front shelf, wafer, susceptor, ring, gap, rear shelf.)

Figure 5.10: Side view of finite volume mesh at y-z mid-plane which serves as a
symmetry plane.

In addition to the chamber model, we have also developed a simplified 3-dimensional
finite-volume mesh for the inlet flange. This mesh uses three thin
gaps in a solid surface to model the three injector slits. The gas mixture flows
vertically downward from an inlet opening through the slits until it reaches the
bottom surface of the inlet flange, at which point it is forced to make a perpendicular
change of direction toward the chamber entrance. The idea is to simulate
the effects that the injector slits and the injector flange geometry have on the flow.
For example, we are interested in seeing how the forced change of direction creates
possible swirling effects, and how varying injector slit widths affects the flow profile
as gases enter and flow through the chamber. This model is separate from the
chamber model, which assumes a uniform flow profile at the inlet to the chamber.

5.4.4  Transport  Phenomena 

The  process-equipment  state  is  determined  by  the  transport  of  mass,  momentum, 
and  heat  energy  in  the  process  and  purge  gases,  and  heat  energy  in  and  among  the 
solids  that  comprise  the  chamber  walls,  shelves,  ring,  susceptor,  and  wafer.  The 
various  effects  are  coupled  through  transport  equations,  state  dependent  material 
parameters, and boundary conditions. We provide here a brief overview of the
main assumptions and equations used in the PHOENICS-CVD transport models,
in particular those we used for modeling silicon growth in the Epsilon-1. Further
details can be found in [82, 83].

Assumptions 

The basic assumptions regarding the gas mixture are that it behaves as a continuum,
is an ideal gas, and is transparent to infrared heat radiation. In addition, the
flow is assumed to be laminar and the effects of viscous heating and pressure variations
on the gas temperature are neglected. These assumptions are widely applicable
to CVD systems and in particular are not limiting for modeling the Epsilon-1.

We  also  made  assumptions  regarding  boundary  conditions  that  are  specific  to 
modeling  the  Epsilon-1.  The  flow  profile  at  the  entrance  to  the  chamber  is  assumed 
to  be  a  uniform  flow  velocity  in  the  direction  normal  to  the  entrance.  All  solids 
in  the  chamber  are  considered  isothermal,  i.e.,  constant  temperature  within  each 
individual  piece  of  apparatus  and  throughout  the  entire  wafer.  Chamber  walls  are 
assumed  to  be  no-slip  and  stationary,  even  though  in  reality  there  are  moving  parts 
in  the  process  chamber.  We  assume  that  the  top  surface  of  the  wafer  is  the  only 
surface  on  which  chemical  reactions  occur.  These  assumptions  can  limit  the  scope 
of  the  predictive  capability  of  the  model.  However,  we  believe  that  they  do  not 
seriously  degrade  model  fidelity  regarding  prediction  of  steady-state  phenomena, 
so  long  as  they  are  accounted  for  in  any  investigation  of  the  factors  that  influence 
uniformity.  The  assumptions  are  discussed  further  in  Section  5.4.6. 


Gas  Phase  Transport 


We now give the basic transport equations for an N-component reacting gas mixture
with K gas phase reactions. Gas flow in the reactor is governed by the familiar
conservation equations for mass and momentum, i.e., the continuity equation

$$\frac{\partial \rho}{\partial t} = -\nabla \cdot (\rho \mathbf{v}) \tag{5.2}$$

and the Navier-Stokes equation

$$\frac{\partial (\rho \mathbf{v})}{\partial t} = \underbrace{-\nabla \cdot (\rho \mathbf{v} \mathbf{v})}_{\text{inertial}} + \underbrace{\nabla \cdot \boldsymbol{\tau}}_{\text{viscous}} \underbrace{- \nabla P}_{\text{pressure}} + \underbrace{\rho \mathbf{g}}_{\text{gravity}} \tag{5.3}$$

where $\rho$ is the gas density, $\mathbf{v}$ is the gas velocity, $P$ is the pressure, and $\mathbf{g}$ is the
gravitational acceleration. The viscous stress tensor $\boldsymbol{\tau}$ for a Newtonian fluid such as
the gas mixture in a CVD reactor takes the form

$$\boldsymbol{\tau} = \mu \left( \nabla \mathbf{v} + (\nabla \mathbf{v})^{T} \right) - \frac{2}{3} \mu \, (\nabla \cdot \mathbf{v}) \, \mathbf{I} \tag{5.4}$$

where $\mu$ is the dynamic viscosity of the gas. For CVD applications, the density $\rho$
and viscosity $\mu$ are strongly dependent on the temperature, pressure, and mixture
composition. For this reason, the gas flow equations are strongly coupled to the
equations for transport of heat energy and species concentrations. In particular,
temperature and concentration gradients cause variations in gas mixture density
which are manifested in buoyancy effects.
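
To make the constitutive relation concrete, the following minimal sketch evaluates the Newtonian viscous stress tensor, tau = mu (grad v + (grad v)^T) - (2/3) mu (div v) I, from a given 3 x 3 velocity gradient. The function name, the index convention grad_v[i][j] = d v_j / d x_i, and the sample values are assumptions introduced for illustration; this is not the PHOENICS-CVD implementation.

```python
def viscous_stress(grad_v, mu):
    """Newtonian viscous stress for a 3x3 velocity gradient (nested lists).

    grad_v[i][j] is taken to be d v_j / d x_i (assumed convention).
    """
    div_v = sum(grad_v[i][i] for i in range(3))
    tau = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            tau[i][j] = mu * (grad_v[i][j] + grad_v[j][i])
            if i == j:
                # Remove the isotropic part so that the tensor is trace-free.
                tau[i][j] -= (2.0 / 3.0) * mu * div_v
    return tau


# Simple shear: only d v_z / d y is non-zero (hypothetical values).
shear = [[0.0, 0.0, 0.0], [0.0, 0.0, 5.0], [0.0, 0.0, 0.0]]
print(viscous_stress(shear, mu=2.0e-5))
```

A uniform expansion (velocity gradient proportional to the identity) produces no viscous stress under this form, since the trace-free part of the gradient vanishes.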

Transport of heat energy in the reactor is governed by the familiar heat equation,
with additional terms to account for effects that occur in chemically reacting
multicomponent gases. In particular, heat is generated and consumed by the
inter-diffusion of different species and by the various gas phase chemical reactions. Also,
heat can flow due to the presence of a concentration gradient, which is referred to
as the Dufour effect. The conservation equation for gas temperature is given by

$$\underbrace{c_p \frac{\partial (\rho T_g)}{\partial t}}_{\text{transient}} = \underbrace{\nabla \cdot (k_c \nabla T_g)}_{\text{conduction}} - \underbrace{c_p \nabla \cdot (\rho \mathbf{v} T_g)}_{\text{convection}} + \underbrace{\nabla \cdot \left( R_g T_g \sum_{i=1}^{N} \frac{D_i^T}{m_i} \nabla (\ln f_i) \right)}_{\text{Dufour}} + \underbrace{\sum_{i=1}^{N} \frac{H_i}{m_i} \nabla \cdot \mathbf{j}_i}_{\text{inter-diffusion}} - \underbrace{\sum_{i=1}^{N} \sum_{k=1}^{K} H_i \nu_{ik} \left( R_k^g - R_{-k}^g \right)}_{\text{reactions}} \tag{5.5}$$

where $c_p$ is the specific heat capacity per unit mass of the gas, $T_g$ is the gas temperature,
$k_c$ is the gas thermal conductivity, and $R_g$ is the universal gas constant.
Associated with the i-th gas species are the mole fraction $f_i$, molar mass $m_i$, thermal
diffusion coefficient $D_i^T$, molar enthalpy $H_i$, and total diffusive mass flux $\mathbf{j}_i$.
The stoichiometric coefficient of the i-th species in the k-th gas phase reaction is
denoted $\nu_{ik}$, with forward reaction rate $R_k^g$ and reverse reaction rate $R_{-k}^g$.

PHOENICS-CVD ignores the Dufour effect since it has been found to be very
small in CVD systems. The density, viscosity, thermal conductivity, specific heat
capacity, and thermal diffusion coefficient are dependent on temperature and gas
mixture composition. For this reason, the heat transfer equation is strongly coupled
to the gas flow and species concentration equations.

Gas species transport in the reactor is governed by a familiar diffusion-convection
equation with an additional source term to account for the creation and destruction
of species due to K reversible chemical reactions. The balance equation for
the concentration of the i-th gas species is given by

$$\underbrace{\frac{\partial (\rho \omega_i)}{\partial t}}_{\text{transient}} = \underbrace{-\nabla \cdot (\rho \mathbf{v} \omega_i)}_{\text{convection}} \underbrace{- \nabla \cdot \mathbf{j}_i}_{\text{diffusion}} + \underbrace{m_i \sum_{k=1}^{K} \nu_{ik} \left( R_k^g - R_{-k}^g \right)}_{\text{reactions}} \tag{5.6}$$

In the above, the concentration of the i-th gas species is a dimensionless mass
fraction

$$\omega_i = \frac{\rho_i}{\rho} \tag{5.7}$$

There are N − 1 independent species concentration equations of the form (5.6)
since the mass fractions must sum to 1, i.e.,

$$\sum_{i=1}^{N} \omega_i = 1 \tag{5.8}$$

The diffusive mass fluxes are defined by

$$\mathbf{j}_i = \rho \omega_i (\mathbf{v}_i - \mathbf{v}) \tag{5.9}$$

with respect to the mass averaged velocity

$$\mathbf{v} = \sum_{i=1}^{N} \omega_i \mathbf{v}_i \tag{5.10}$$

and satisfy

$$\sum_{i=1}^{N} \mathbf{j}_i = 0 \tag{5.11}$$

again leaving N − 1 independent variables.

Gas species diffusion is caused by concentration gradients, which we refer to
as ordinary diffusion, and by temperature gradients, which we refer to as thermal
diffusion, or the Soret effect. The total diffusive flux is expressed as the sum of
these two components

$$\mathbf{j}_i = \mathbf{j}_i^C + \mathbf{j}_i^T \tag{5.12}$$

where $\mathbf{j}_i^C$ and $\mathbf{j}_i^T$ denote the concentration driven and thermally driven diffusive
fluxes, respectively. The ordinary diffusive mass fluxes can be computed via Fick’s
law, the Wilke approximation, or the full Stefan-Maxwell equations, depending
upon the properties of the gas mixture, the desired degree of fidelity, and the
available computational resources. The Stefan-Maxwell formulation is given by

$$\nabla \omega_i + \omega_i \nabla (\ln m) = \frac{m}{\rho} \sum_{j=1,\, j \neq i}^{N} \frac{1}{m_j D_{ij}} \left( \omega_i \mathbf{j}_j^C - \omega_j \mathbf{j}_i^C \right) \tag{5.13}$$

with $m$ the average molar mass of the mixture and $D_{ij}$ the binary diffusion
coefficients. The diffusive mass fluxes due to thermal diffusion are given by

$$\mathbf{j}_i^T = -D_i^T \nabla (\ln T_g) \tag{5.14}$$

where the thermal diffusion coefficient $D_i^T$ for each species is a function of temperature
and gas mixture composition. In general $D_i^T > 0$ for large, heavy molecules
and $D_i^T < 0$ for small, light molecules, resulting in the observed separation of
species due to thermal gradients.
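
The sign convention can be checked with a one-dimensional sketch. Assuming a thermal diffusion flux of the form j^T = -D^T d(ln T)/dy, a species with D^T > 0 is driven down the temperature gradient (toward the cold side), and a species with D^T < 0 is driven up it. The coefficient values, temperatures, and gap height below are hypothetical, and the gradient is approximated by a single finite difference.

```python
import math


def thermal_diffusion_flux(d_thermal, temp_cold, temp_hot, distance):
    """1D thermal diffusion flux, with y increasing from the cold to the hot point.

    Uses j^T = -D^T * d(ln T)/dy, with the gradient taken as a finite difference.
    A negative result means the flux points toward the cold point.
    """
    grad_ln_t = (math.log(temp_hot) - math.log(temp_cold)) / distance
    return -d_thermal * grad_ln_t


# Hypothetical binary mixture: heavy species (+D^T) and light species (-D^T).
j_heavy = thermal_diffusion_flux(2.0e-6, temp_cold=600.0, temp_hot=900.0, distance=0.02)
j_light = thermal_diffusion_flux(-2.0e-6, temp_cold=600.0, temp_hot=900.0, distance=0.02)
print(j_heavy, j_light)
```

In this two-species example the coefficients are chosen equal and opposite, so the two thermal diffusion fluxes cancel, consistent with the requirement that the diffusive mass fluxes sum to zero.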

The last term in the species concentration equation (5.6) represents the creation
and destruction of the i-th species due to homogeneous gas phase reactions. The
forward and reverse reaction rates are given by

$$R_k^g = k_g \prod_{\text{reactants}} \left( \frac{P f_i}{R_g T_g} \right)^{|\nu_{ik}|} \tag{5.15}$$

$$R_{-k}^g = k_{-g} \prod_{\text{products}} \left( \frac{P f_i}{R_g T_g} \right)^{|\nu_{ik}|} \tag{5.16}$$

where $k_g$ and $k_{-g}$ are the forward and reverse reaction rate constants, and $P$ the
total pressure.
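
As a minimal sketch of how such rates are evaluated, the code below assumes mass-action kinetics: each rate is the rate constant multiplied by the molar concentrations P f_i / (R_g T_g) of the participating species, raised to the magnitudes of their stoichiometric coefficients. The sign convention (negative coefficients for reactants, positive for products), the function names, and all numerical values are assumptions made for illustration.

```python
R_GAS = 8.314462618  # universal gas constant, J/(mol K)


def molar_concentration(pressure, mole_fraction, temperature):
    """Ideal-gas molar concentration P * f / (R_g * T) in mol/m^3."""
    return pressure * mole_fraction / (R_GAS * temperature)


def mass_action_rates(k_forward, k_reverse, stoich, mole_fractions, pressure, temperature):
    """Return (forward, reverse) rates for one reversible gas phase reaction.

    stoich maps species name to its coefficient: negative for reactants,
    positive for products (assumed convention).
    """
    forward = k_forward
    reverse = k_reverse
    for species, nu in stoich.items():
        conc = molar_concentration(pressure, mole_fractions[species], temperature)
        if nu < 0:
            forward *= conc ** (-nu)
        elif nu > 0:
            reverse *= conc ** nu
    return forward, reverse


# Silane decomposition into silylene and hydrogen, with hypothetical inputs.
stoich = {"SiH4": -1, "SiH2": 1, "H2": 1}
fractions = {"SiH4": 0.1, "SiH2": 0.01, "H2": 0.89}
print(mass_action_rates(2.0, 3.0, stoich, fractions, pressure=2666.0, temperature=1000.0))
```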

Boundary  Conditions 

For  each  of  the  gas  phase  transport  equations  there  is  an  associated  set  of  boundary 
conditions, which prescribe the state (or associated flux) at the inlet, outlet, chamber
walls, and chamber apparatus including susceptor and wafer. The boundary
conditions  for  temperature  and  species  concentrations  are  responsible  for  coupling 
the  gas  phase  transport  phenomena  to  heat  transfer  in  the  solids  and  wafer  surface 
chemical  reactions,  respectively.  We  elaborate  further  below. 

For  each  inlet  boundary  (process  and  purge),  we  prescribe  the  inflow  velocity 
of  the  gas  mixture  normal  to  the  inflow  opening,  and  the  mass  fraction  for  each 
of  the  gaseous  species  (e.g.,  silane  and  hydrogen).  The  values  are  set  according  to 
the  process  recipe  we  wish  to  simulate.  The  temperature  of  the  gas  mixture  at  the 
inlet  is  set  to  room  temperature.  There  is  no  species  diffusion  through  the  inlet. 
These  conditions  are  given  by 

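$$\mathbf{n} \cdot \mathbf{v} = v_{in}, \quad \mathbf{n} \times \mathbf{v} = 0, \quad T_g = T_{room}, \quad \omega_i = \omega_{i,in}, \quad \mathbf{n} \cdot \mathbf{j}_i = 0 \tag{5.17}$$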

where  n  is  the  unit  vector  normal  to  the  inlet  opening. 

For  the  outlet  boundary,  we  impose  zero  gradient  conditions  for  all  variables. 
These  conditions  are  given  by 

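$$\mathbf{n} \cdot \nabla (\rho \mathbf{v}) = 0, \quad \mathbf{n} \times \mathbf{v} = 0, \quad \mathbf{n} \cdot (k_c \nabla T_g) = 0, \quad \mathbf{n} \cdot \mathbf{j}_i = 0 \tag{5.18}$$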

where  n  is  the  unit  vector  normal  to  the  outlet  opening. 

Boundary conditions at the solid-gas interfaces can be more complicated, mainly
due to chemically reacting surfaces and heat transfer in the solids. First we consider
non-reacting surfaces, e.g., chamber walls, quartz shelves, ring, and susceptor. At
these surfaces, the no-slip and impermeability conditions apply, i.e., flow velocities
are set to zero. Also, the total mass flux normal to each non-reacting surface must
be zero for each of the species. Note, however, that due to thermal diffusion, the
concentration gradients normal to the surface will generally not be zero. These
conditions are given by

$$\mathbf{v} = 0, \quad \mathbf{n} \cdot \mathbf{j}_i = 0 \tag{5.19}$$

where $\mathbf{n}$ is the unit vector normal to the non-reacting surface.

For  a  reacting  surface,  which  in  our  case  refers  only  to  the  top  side  of  the 
wafer,  there  is  a  net  mass  production  rate  for  each  gaseous  species.  The  velocity 
component  normal  to  the  surface  is  proportional  to  this  rate,  while  the  tangential 
component  is  zero.  Furthermore,  the  total  mass  flux  normal  to  the  reacting  surface 
is  set  equal  to  the  production  rate.  These  conditions  for  a  process  with  L  surface 
reactions  are  given  by 

$$\mathbf{n} \cdot \mathbf{v} = \frac{1}{\rho} \sum_{i=1}^{N} m_i \sum_{l=1}^{L} \sigma_{il} R_l^s, \quad \mathbf{n} \times \mathbf{v} = 0, \quad \mathbf{n} \cdot (\rho \omega_i \mathbf{v} + \mathbf{j}_i) = m_i \sum_{l=1}^{L} \sigma_{il} R_l^s \tag{5.20}$$

where $\mathbf{n}$ is the unit vector normal to the reacting surface, $\sigma_{il}$ is the stoichiometric
coefficient for the i-th gas species in the l-th surface reaction, and $R_l^s$ is the reaction
rate for the l-th surface reaction. The surface reaction rate is equal to the product
of the collision rate of molecules with the wafer surface and the reaction probability,
called the reactive sticking coefficient (RSC).
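
The product form of the surface reaction rate can be sketched as follows, assuming that the collision rate is given by the standard kinetic-theory (Hertz-Knudsen) expression for an ideal gas, P f_i / sqrt(2 pi m_i R_g T). Whether PHOENICS-CVD evaluates the collision rate in exactly this way is not stated here, so the expression should be read as an assumption. The RSC value and process conditions are hypothetical.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J/(mol K)


def collision_flux(pressure, mole_fraction, molar_mass, temperature):
    """Molar collision flux with a surface, mol/(m^2 s), from ideal-gas kinetic theory."""
    return pressure * mole_fraction / math.sqrt(
        2.0 * math.pi * molar_mass * R_GAS * temperature
    )


def surface_reaction_rate(rsc, pressure, mole_fraction, molar_mass, temperature):
    """Surface reaction rate as collision flux times reactive sticking coefficient."""
    return rsc * collision_flux(pressure, mole_fraction, molar_mass, temperature)


# Hypothetical conditions: 1% silane (0.032118 kg/mol) at 2666 Pa and 900 K.
print(surface_reaction_rate(1.0e-4, 2666.0, 0.01, 0.032118, 900.0))
```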

Thermal boundary conditions at the solid-gas interfaces can be complex due
to heat transfer within and among the various solids in the reactor. This includes
the effects of conduction within the solids, convective losses to the gas phase, and
radiative transfer among the various surfaces. Heat radiation supplied by the lamps
is especially important.

PHOENICS-CVD provides the capability for modeling heat transfer in the
solids and coupling these effects to the gas phase transport phenomena via boundary
conditions. Surface-to-surface radiation is modeled using viewfactor methods.
However, due to the extremely complicated geometry of the Epsilon-1 lamp-house
and reflector apparatus, and the very large number of solid surfaces with varying
optical properties in the process chamber, we found the PHOENICS-CVD radiation
modeling tool to be impractical for our purposes. This is discussed further in
Section 5.4.6.

Instead  of  modeling  heat  transfer  in  the  solids,  we  assumed  that  the  wafer 
and  susceptor  were  at  a  constant  uniform  temperature,  and  used  anecdotal  and 
experimental  data  from  the  manufacturer  to  estimate  the  temperature  on  other 
surfaces.  The  boundary  conditions  are  given  by 

$$T_g = T_{surf} \tag{5.21}$$

for the case where the gas-solid interface is an isothermal surface and

$$\mathbf{n} \cdot \nabla T_g = 0 \tag{5.22}$$

when there is an adiabatic surface.
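
The difference between the two thermal boundary types can be illustrated on a one-dimensional, cell-centered finite-volume grid for steady conduction. In the sketch below, an isothermal wall contributes a doubled coefficient (the wall lies half a cell from the cell center), and an adiabatic wall contributes none. The solver, the Gauss-Seidel iteration, and the temperatures are illustrative assumptions and are unrelated to the actual PHOENICS solution procedure.

```python
def steady_temperatures(n_cells, t_left, right_bc, t_right=0.0, sweeps=5000):
    """Steady 1D conduction with an isothermal left wall.

    right_bc is "isothermal" (wall held at t_right) or "adiabatic" (zero flux).
    Returns the cell-center temperatures after Gauss-Seidel sweeps.
    """
    temps = [t_left] * n_cells
    for _ in range(sweeps):
        for i in range(n_cells):
            if i == 0:
                a_w, t_w = 2.0, t_left  # isothermal wall, half-cell distance
            else:
                a_w, t_w = 1.0, temps[i - 1]
            if i < n_cells - 1:
                a_e, t_e = 1.0, temps[i + 1]
            elif right_bc == "isothermal":
                a_e, t_e = 2.0, t_right
            elif right_bc == "adiabatic":
                a_e, t_e = 0.0, 0.0  # zero normal gradient: no east contribution
            else:
                raise ValueError("right_bc must be 'isothermal' or 'adiabatic'")
            temps[i] = (a_w * t_w + a_e * t_e) / (a_w + a_e)
    return temps


print(steady_temperatures(10, 900.0, "isothermal", 600.0))
print(steady_temperatures(10, 900.0, "adiabatic"))
```

With two isothermal walls the profile is linear between the wall temperatures; with an adiabatic wall the gas takes on the temperature of the single isothermal wall.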

Specific  values  for  the  boundary  conditions  were  set  according  to  the  process 
recipes  that  we  were  simulating.  Values  are  provided  as  simulations  are  described 
in  Section  5.5. 

Material  Properties 

Models for describing the dependence of material properties on the
process-equipment state are presented in [82, 83] and included in the PHOENICS-CVD
software. Furthermore, PHOENICS-CVD provides databases containing any
necessary parameters for determining the transport, thermodynamic, and optical
properties of most materials commonly used in CVD processes.

Transport properties of the gases include viscosity, thermal conductivity, and
ordinary and thermal diffusion coefficients. Their dependence on temperature,
pressure, and gas mixture composition is determined using the Lennard-Jones potential
and kinetic theory. Lennard-Jones parameters for the individual gases are
provided in a database. Properties of the gas mixture are calculated from the
individual gas properties. For example, a semi-empirical relationship is employed
for determining mixture viscosity.

Thermodynamic  properties  of  the  gases  include  specific  heat  capacity,  standard 
heat  of  formation,  and  standard  entropy.  These  properties  are  given  as  functions 
of  temperature  via  polynomial  approximations,  with  a  different  polynomial  for 
each  of  three  temperature  ranges.  Polynomial  coefficients  for  individual  gases  are 
provided  in  a  database.  Again,  properties  of  the  gas  mixture  are  calculated  from 
the  individual  gas  properties.  For  example,  density  is  defined  in  terms  of  the  mean 
molecular  mass  and  specific  heat  is  defined  as  the  mass  averaged  value. 
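
The two mixture rules just mentioned can be sketched as follows, assuming the ideal gas law for density and a mass-fraction weighting for specific heat. The constant species heat capacities are rough illustrative values standing in for the temperature-dependent polynomials, and the mixture composition and conditions are hypothetical.

```python
R_GAS = 8.314462618  # universal gas constant, J/(mol K)


def mean_molar_mass(mole_fractions, molar_masses):
    """Mole-fraction weighted mean molar mass, kg/mol."""
    return sum(f * m for f, m in zip(mole_fractions, molar_masses))


def mass_fractions(mole_fractions, molar_masses):
    """Convert mole fractions to mass fractions."""
    m_bar = mean_molar_mass(mole_fractions, molar_masses)
    return [f * m / m_bar for f, m in zip(mole_fractions, molar_masses)]


def mixture_density(pressure, temperature, mole_fractions, molar_masses):
    """Ideal-gas mixture density, kg/m^3."""
    return pressure * mean_molar_mass(mole_fractions, molar_masses) / (R_GAS * temperature)


def mixture_specific_heat(mole_fractions, molar_masses, cp_species):
    """Mass-averaged specific heat capacity, J/(kg K)."""
    w = mass_fractions(mole_fractions, molar_masses)
    return sum(wi * cpi for wi, cpi in zip(w, cp_species))


# Silane / hydrogen mixture; molar masses in kg/mol, cp values are rough placeholders.
f = [0.01, 0.99]
m = [0.032118, 0.002016]
cp = [1330.0, 14300.0]
print(mixture_density(2666.0, 900.0, f, m), mixture_specific_heat(f, m, cp))
```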

Optical properties of the solids include refractive indices and absorption coefficients.
The temperature dependence of these properties in each of 60 spectral
intervals is provided in a database. However, as stated earlier, we did not use the
PHOENICS-CVD surface-to-surface radiation model, so the optical properties of
the solids, e.g., quartz, do not play a role in our simulations.

5.4.5 Chemical Mechanisms for Growth


In Section 5.3 we presented the results of growth experiments and showed that,
for a range of operating conditions, a simple Arrhenius law given by

$$R_{Si} = k_0 \exp\left( -\frac{E_a}{R_g T} \right) x_{SiH_4} \tag{5.23}$$

provides an accurate model for predicting growth rate as a function of wafer temperature.
Parameters such as activation energy were calculated by fitting the
experimental data to the model.
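
A fit of this kind can be sketched as a linear regression, assuming the Arrhenius form R = k0 exp(-Ea / (R_g T)) x, so that ln(R / x) is linear in 1/T with slope -Ea / R_g and intercept ln k0. The data below are synthetic, generated from assumed parameter values, and are not the experimental data of Section 5.3.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J/(mol K)


def arrhenius_rate(k0, e_a, temperature, x_silane):
    """Growth rate from the assumed Arrhenius form."""
    return k0 * math.exp(-e_a / (R_GAS * temperature)) * x_silane


def fit_arrhenius(temperatures, rates, x_silane):
    """Least-squares fit of ln(rate / x) against 1/T; returns (k0, e_a)."""
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(r / x_silane) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope * R_GAS


# Synthetic data from assumed parameters (k0 = 1e6, Ea = 150 kJ/mol, x = 0.01).
temps = [900.0, 950.0, 1000.0, 1050.0]
data = [arrhenius_rate(1.0e6, 1.5e5, t, 0.01) for t in temps]
print(fit_arrhenius(temps, data, 0.01))
```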

There  are  several  assumptions  and  simplifications,  both  explicit  and  implicit, 
in  the  above  Arrhenius  model.  It  assumes  that  silicon  growth  is  almost  completely 
due  to  the  heterogeneous  decomposition  of  silane  into  silicon  and  hydrogen  on  the 
wafer  surface.  Thus,  it  models  only  a  single  surface  reaction  step.  Furthermore,  it 
is  implicitly  assumed  that  inlet  conditions  for  silane  mole  fraction  hold  constant 
throughout the process chamber. This allows for a separation of the factor multiplying
the exponential term into a mole fraction variable and a pre-exponential
constant. The result is that growth rate is assumed to be dependent entirely on
two process recipe inputs: wafer temperature set-point and silane mole fraction
at the inlet; and two process dependent physical-chemical parameters: activation
energy and pre-exponential constant.

The above approach takes the view that surface reactions are dominant and gas
phase reactions are negligible. However, it was demonstrated by Coltrin and
co-workers [31] that as chamber pressure increases, gas phase reactions play a greater
role. They showed that at atmospheric pressure, silicon growth may be almost
completely due to reactive intermediates formed in the gas phase.

Kleijn develops a model for gas phase and surface chemistry in [81] for temperatures
and pressures in an intermediate range, near the conditions at which
NG-ESSS deposits silicon in the Epsilon-1. It is a relatively closed subsystem of
the full kinetic model that was used by Coltrin and co-workers. The key reaction
is the homogeneous decomposition of silane, which leads to the formation of silylene
(SiH2) and hydrogen. Further reactions produce disilane (Si2H6), trisilane
(Si3H8), and silylsilylene (Si2H4). The five-step gas phase reaction mechanism is
given by

$$\mathrm{SiH_4} \rightleftharpoons \mathrm{SiH_2} + \mathrm{H_2} \tag{5.24}$$

$$\mathrm{Si_2H_6} \rightleftharpoons \mathrm{SiH_4} + \mathrm{SiH_2} \tag{5.25}$$

$$\mathrm{Si_3H_8} \rightleftharpoons \mathrm{Si_2H_6} + \mathrm{SiH_2} \tag{5.26}$$

$$\mathrm{Si_2H_4} \rightleftharpoons \mathrm{SiH_2} + \mathrm{SiH_2} \tag{5.27}$$

$$\mathrm{Si_2H_6} \rightleftharpoons \mathrm{Si_2H_4} + \mathrm{H_2} \tag{5.28}$$

each of which has an associated forward and reverse reaction rate constant, $k_g$ and
$k_{-g}$, respectively. For silicon growth at the wafer surface from silane and the
reactive intermediates, Kleijn uses a set of five surface reactions given by

$$\mathrm{SiH_4(g)} \rightarrow \mathrm{Si(s)} + 2\,\mathrm{H_2(g)} \tag{5.29}$$

$$\mathrm{SiH_2(g)} \rightarrow \mathrm{Si(s)} + \mathrm{H_2(g)} \tag{5.30}$$

$$\mathrm{Si_2H_6(g)} \rightarrow 2\,\mathrm{Si(s)} + 3\,\mathrm{H_2(g)} \tag{5.31}$$

$$\mathrm{Si_3H_8(g)} \rightarrow 3\,\mathrm{Si(s)} + 4\,\mathrm{H_2(g)} \tag{5.32}$$

$$\mathrm{Si_2H_4(g)} \rightarrow 2\,\mathrm{Si(s)} + 2\,\mathrm{H_2(g)} \tag{5.33}$$

each of which has an associated RSC. We refer to the reaction schemes (5.24)-(5.28)
together with (5.29)-(5.33) as the Kleijn model for poly-Si deposition.

The reaction rate constants and RSCs in the Kleijn model are derived from
studies by various investigators. The gas phase forward and reverse reaction rate
constants are, in general, temperature and pressure dependent, given by the
expression

kg  =  A  PK  exp  [yy  1  (5-34) 

where  parameters  A  and  Ea  were  fitted  to  experimental  data  for  temperatures  from 
300-1100  K  and  pressures  from  10-100  Torr.  Furthermore,  each  RSC  is  given  by 
a  different  complicated  function  of  wafer  and  gas  temperature.  Thus,  the  reaction 
scheme  includes  a  complicated  temperature  dependence  and  is  a  function  of  a 
large  number  of  physical-chemical  parameters,  e.g.,  multiple  activation  energies 
and  multiple  sticking  coefficients. 

PHOENICS-CVD  provides  the  necessary  tools  to  implement  the  Kleijn  model, 
including  a  database  of  experimentally  determined  kinetics  parameters.  Thus, 
we  used  the  Kleijn  model  to  describe  poly-Si  chemical  reaction  kinetics  in  the 
Epsilon-1.  In  contrast  to  the  initial  simplified  Arrhenius  models,  the  chemistry 
model  is  coupled  to  the  gas  phase  transport  model,  since  gas  phase  reactions  play 
an  important  role.  Furthermore,  no  assumptions  are  made  regarding  the  spatial 
distribution  of  temperature  and  reactant  concentrations  in  the  chamber. 

5.4.6  Unmodeled  Phenomena  and  Equipment 

As stated earlier, even with powerful tools at our disposal, development of a comprehensive
model that incorporates every relevant feature of the Epsilon-1 reactor
is not practical. Here, we discuss some of the unmodeled features, phenomena, and
processes that are relevant to growth in the Epsilon-1 but were not implemented
in our models.

Si-Ge growth chemistry


Although  reaction  schemes  including  gas  phase  and  surface  reactions  for  growth  of 
Si-Ge  from  silane  and  germane  precursors  have  appeared  recently  in  the  literature, 
experimentally determined physical-chemical parameter values for such schemes
are proprietary information and in general not widely available. We have contacted
researchers at a government laboratory [107] regarding future experiments
to  determine  rate  constants  and  sticking  coefficients  for  Si-Ge  growth. 

Development  of  models  for  Si-Ge  growth  is  particularly  complicated  due  to 
the  large  number  of  phenomena  involved  and  the  manner  in  which  the  deposited 
film  depends  on  the  process-equipment  state.  For  example,  epitaxial  Si-Ge  layers 
are  deposited  using  either  dichlorosilane,  silane,  or  disilane,  along  with  germane. 
Deposition  rate  and  germanium  content  have  been  observed  to  be  dependent  on 
the  choice  of  precursor  gas  [74]  and  germane  concentration  [73].  Furthermore, 
both  of  these  effects  have  been  observed  to  be  temperature  dependent  [38].  Thus, 
uniformity  may  be  affected  in  different  ways  by  non-uniform  concentrations  of 
different  reactants  at  various  temperatures. 

Radiative  heat  transfer,  lamp-house,  and  reflectors 

Radiative heat transfer modeling is implemented in PHOENICS-CVD via viewfactor
methods. This requires a discretization of the solid surfaces in the chamber
into a large number of smaller surfaces that are considered isothermal and of constant
optical properties. The predictive capability of the method depends on the
number and size of the individual surfaces, which we refer to as the discretization
resolution, as well as the accuracy of the chamber geometry implemented in the
model.

Although the finite volume mesh used to model the chamber geometry is effective
for capturing relevant gas phase transport phenomena, it neglects various
features of the equipment which would have a significant effect on radiative heat
transfer. Such features include apparatus containing wafer rotation machinery
within the lower chamber section and the complicated lamp-house and reflector
equipment. In the Epsilon-1, there is a variety of reflector designs, including
both diffuse and specular, and some of a special parabolic shape. Also, certain
parameters are known only roughly, such as the power supplied by the lamps, or
equivalently the temperature of the filaments when they are turned on to 100%
power. It is simply not practical to include the geometry and properties of these
pieces of equipment in the finite volume reactor model.

Furthermore,  in  the  Epsilon-1,  there  exist  many  transitions  from  one  material 
to  another,  gaps  between  different  pieces  of  the  equipment,  and  a  non-symmetric 
lenticular  shape.  This  necessitates  an  extremely  high  discretization  resolution  for 
viewfactor  modeling.  For  all  of  the  above  reasons,  we  believe  that  modeling  of 
radiative  heat  transfer  in  the  Epsilon-1  using  the  PHOENICS-CVD  framework  is 
too  unwieldy  and  computationally  expensive  to  be  practical. 

Currently,  solid  surfaces  are  modeled  as  isothermal,  i.e.,  constant  temperature 
within  each  individual  piece  of  apparatus  and  throughout  the  entire  wafer.  Tem¬ 
perature  values  are  set  according  to  empirical  data  supplied  by  the  manufacturer. 
In  reality,  temperature  gradients  exist  within  individual  pieces  of  equipment,  and 
the  average  steady-state  temperature  of  any  part  of  the  reactor  is  known  only 
roughly. However, it is not likely that implementation of a radiative heat transfer
model using the PHOENICS-CVD framework would produce more accurate
surface temperatures than manufacturer supplied empirical data.
A separate model restricted only to heat transfer within and among the solids
in the Epsilon-1, and implemented using a more flexible programming environment,
is presented in Chapter 6. It is desirable to incorporate a more comprehensive
conjugate heat transfer model to reflect the relevant geometrical features
at  sufficiently  high  resolution  to  provide  an  accurate  picture  of  the  temperature 
distribution  in  the  Epsilon-1  solids.  Steady-state  temperature  values  could  then 
be  used  as  boundary  conditions  in  a  refined  process-equipment  model. 

Wafer  rotation 

The  actual  mechanical  rotation  of  the  wafer  is  not  accounted  for  in  the  models. 
CFD  software  routines  for  modeling  of  rotating  objects  in  a  flow  environment  are 
available,  but  require  axisymmetry  of  the  entire  domain.  Therefore,  a  separate 
effort  would  be  required  to  investigate  the  effect  of  mechanical  wafer  rotation  on 
the chamber flow immediately surrounding the wafer. The effect of wafer rotation
on growth can be studied, partially, by performing averaging calculations on
simulation results, i.e., averaging deposition rates around the wafer surface.
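
The averaging idea can be sketched in a few lines. The following is a minimal illustration, not the procedure used with the PHOENICS output: the function `azimuthal_average`, the field `example_rate`, and the sampling radii are all hypothetical, and the rate field is a made-up linear profile standing in for simulated deposition rates.

```python
import math


def azimuthal_average(rate, radius, n_theta=360):
    """Mean of rate(x, y) over n_theta equally spaced points on a circle."""
    total = 0.0
    for k in range(n_theta):
        theta = 2.0 * math.pi * k / n_theta
        total += rate(radius * math.cos(theta), radius * math.sin(theta))
    return total / n_theta


def example_rate(x, y):
    # Hypothetical stand-in for simulation output: 650 at the wafer center,
    # decreasing linearly in the downstream (+y) direction.
    return 650.0 - 200.0 * y


# Radii in meters (hypothetical sampling locations on the wafer).
radii = [0.0, 0.025, 0.05, 0.075]
rotated_profile = [azimuthal_average(example_rate, r) for r in radii]
```

For a purely linear front-to-rear gradient such as this one, the azimuthal mean equals the center value at every radius, so rotation removes the front-to-rear variation and leaves only whatever radial variation the field contains.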

Rotation  shaft  purge 

There  is  an  additional  purge  gas  inlet  through  the  wafer  rotation  shaft.  Purge 
gases  injected  through  the  rotation  shaft  enter  the  chamber  directly  underneath 
the  center  of  the  wafer.  Recall  that  the  wafer  rests  on  small  pins  attached  to 
the  top  of  the  susceptor.  The  purpose  of  the  shaft  purge  is  to  prevent  the  source 
gases  from  flowing  between  the  wafer  and  susceptor  which  could  result  in  back-side 
deposition. Typically, the shaft purge is set to 3 slm H2. This may have an effect
on the chamber flow in the vicinity of the wafer. In addition, the presence of the
rotation shaft would affect the chamber flow in the lower chamber section. The
rotation shaft and related apparatus are not included in the process-equipment
model.

Deposition  on  chamber  walls 

The  process-equipment  model  treats  all  surfaces  other  than  the  top  surface  of  the 
wafer  as  non-reacting.  However,  deposition  on  other  surfaces  does  occur  in  the 
Epsilon-1. A build-up of such films is prevented by preceding the deposition step
with an HCl etch clean. In Section 5.5.4, we slightly modify the process-equipment
model  so  that  the  back-side  of  the  susceptor  becomes  a  reacting  surface.  This 
limited  study  could  be  expanded  to  study  deposition  on  other  chamber  surfaces  as 
well. 

Chamber  wall  cooling 

The  quartz  chamber  and  lamp-house  are  cooled  by  air  flow.  There  is  little  data 
regarding  characteristics  of  the  air  flow  and  convective  losses  from  the  outer  wall 
of  the  chamber.  These  effects  are  not  modeled.  Instead,  a  constant  temperature 
is  set  for  each  of  the  chamber  walls. 


5.5  Results  and  Applications 

In this section we present results from poly-Si growth simulations using the
Epsilon-1 process-equipment model. We use the simulation results to study various
characteristics of the thin films and reactor operation that impact on manufacturing
effectiveness. We show that the model can be used to predict growth rate and
uniformity and to better understand the factors that influence these measures of
performance. We also present simulation results that provide guidance toward
improved setting of purge gas flow rates.

Because  deposition  times  are  typically  much  longer  than  initial  transients  in 
radiative heating and gas phase transport in the Epsilon-1, it is reasonable to
assume  that  all  growth  occurs  during  steady-state  operation.  For  this  reason,  all 
simulations  described  in  this  section  predict  steady-state  values  of  growth  rate  and 
other  variables. 

The  PHOENICS  input  code,  called  a  Q1  file,  used  for  simulating  poly-Si  growth 
at 750 C temperature, 30 sccm silane flow rate, and 20 Torr chamber pressure, is
presented  in  Appendix  I.  The  PHOENICS  code  for  the  other  simulations  that  we 
conducted  is  similar. 

5.5.1  Deposition  Rate  Prediction 

We have performed poly-Si growth simulations using the Epsilon-1 process-equipment
model to study the relationship between growth rate and the various process
recipe inputs. Prediction of growth rate given process conditions is important for
taking advantage of the flexibility of the Epsilon-1. The manufacturer provides
some predictive guidance and data, but the process-equipment model can allow
for prediction of growth rates “off-the-curve” and also provide a tool for performing
multiple trial-and-error steps and for understanding the factors that influence
growth  rate  and  uniformity. 

Wafer  Temperature  Sensitivity 

Here we investigate the relationship between silicon growth rate and wafer temperature
in the Epsilon-1. We have already studied this relationship in Section 5.3,
where poly-Si growth rates were measured experimentally for a range of wafer temperatures
(700, 725, 750 C) and silane flow rates (30, 50, 70 sccm) with pressure
fixed  at  20  Torr.  It  was  shown  that  for  the  given  range  of  operating  conditions, 
a  simple  Arrhenius  law  provides  an  accurate  model  for  predicting  growth  rate  as 
a  function  of  wafer  temperature.  Arrhenius  model  parameters  such  as  activation 
energy  were  then  calculated  to  fit  the  experimental  data. 

In contrast, the process-equipment model adopts the more complicated multi-step
Kleijn model for silicon deposition. This growth model includes both gas phase
and  surface  reactions,  involves  multiple  reactive  intermediaries,  and  is  coupled  to 
transport  models.  Because  of  the  model’s  complicated  nature,  it  is  reasonable  to 
expect  difficulty  in  isolating  the  effect  of  wafer  temperature  on  simulated  growth 
rate. However, using reactor simulations, we show below that, like the experimentally
determined growth rates, simulated growth rates can also be fitted to
a  simple  Arrhenius  law  relating  growth  rate  to  wafer  temperature.  Furthermore, 
the  single  activation  energy  in  the  Arrhenius  law  fitted  to  simulated  growth  rates 
is  nearly  the  same  as  that  which  was  fitted  to  experimentally  determined  growth 
rates.  Thus,  there  appears  to  be  an  underlying  dominant  chemical  mechanism  that 
obscures  the  effect  of  gas  phase  phenomena. 

The predictive capability of the process-equipment model was tested by simulating
poly-Si growth using operating conditions for pressure, temperature, and
flow rate that are duplicates of conditions used for experiments presented in [117].
The boundary conditions used for the simulations are given in Table 5.4. The simulation
results are presented in Table 5.5 together with corresponding experimental
results  from  [117]. 
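
The process inlet entries of Table 5.4 follow from the recipe flow rates by standard mixing rules and the ideal gas law. The sketch below is our own illustration of that arithmetic, not code from the thesis: the function name `inlet_conditions` is hypothetical, the H2 molar mass of 2.016 g/mol is an assumed standard value, and volumetric flow fractions are taken equal to mole fractions (ideal gas). The inlet velocity is not computed because it depends on the inlet cross-section, which is not given here.

```python
R_GAS = 8.314462618       # universal gas constant, J/(mol K)
TORR_TO_PA = 133.322368   # pascals per Torr
M_H2 = 2.016              # g/mol, assumed standard value for hydrogen
M_SIH4 = 32.12            # g/mol, silane molar mass as listed in Table 5.4


def inlet_conditions(carrier_slm, source_slm, silane_in_source=0.02,
                     pressure_torr=20.0, temperature_c=20.0):
    """Mixture properties at the process inlet (ideal-gas assumption)."""
    silane_slm = silane_in_source * source_slm
    total_slm = carrier_slm + source_slm
    x = silane_slm / total_slm                      # silane mole fraction
    m_mix = x * M_SIH4 + (1.0 - x) * M_H2           # mixture molar mass, g/mol
    w = x * M_SIH4 / m_mix                          # silane mass fraction
    temperature_k = temperature_c + 273.15
    rho = pressure_torr * TORR_TO_PA * (m_mix * 1e-3) / (R_GAS * temperature_k)
    return {
        "silane_sccm": 1000.0 * silane_slm,
        "mole_fraction": x,
        "mixture_molar_mass": m_mix,
        "mass_fraction": w,
        "density_kg_m3": rho,
    }


# The three recipes of Table 5.4: 20 slm carrier with 1.5, 2.5, 3.5 slm source.
recipes = [inlet_conditions(20.0, s) for s in (1.5, 2.5, 3.5)]
```

Under these assumptions the computed mole fractions, molar masses, mass fractions, and densities round to the tabulated entries, which also confirms the powers of ten used in the table headings.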

Boundary Conditions Used In Epsilon-1 Reactor Simulations

Process Inlet Conditions
Carrier Gas: H2
Source Gas:  2% SiH4 in H2

Condition                                   Values
Flow Rate of Carrier (slm)                  20       20       20
Flow Rate of Source (slm)                   1.5      2.5      3.5
Flow Rate of Silane (sccm)                  30       50       70
Mole Fraction of Silane (x 10^-3)           1.40     2.22     2.98
Molar Mass of Silane (g/gmol)               32.12    32.12    32.12
Molar Mass of Gas Mixture (g/gmol)          2.06     2.08     2.11
Mass Fraction of Silane (x 10^-2)           2.18     3.43     4.54
Density of Gas Mixture (kg/m^3 x 10^-3)     2.25     2.28     2.30
Velocity of Gas Mixture (m/sec)             1.40     1.47     1.53
Temperature of Gas Mixture (C)              20       20       20

Purge Inlet Conditions
Purge Gas: H2

Condition                                   Value
Flow Rate of Purge Gas (slm)                7
Velocity of Purge Gas (m/sec)               0.45
Temperature of Purge Gas (C)                20

Solid-Gas Interface Conditions

Condition                                   Values
Temperature of Wafer (C)                    700      725      750
Temperature of Susceptor (C)                700      725      750
Temperature of Ring (C)                     Conductive Solid
Temperature of Front Quartz Shelf (C)       Conductive Solid
Temperature of Rear Quartz Shelf (C)        Conductive Solid
Temperature of Upper Chamber Wall (C)       400      425      450
Temperature of Lower Chamber Wall (C)       400      425      450

Table 5.4: Boundary conditions for process gas inlet, purge gas inlet, and solid
surfaces used in simulations of poly-Si growth in the Epsilon-1 reactor.

Process-Equipment Model Predictive Capability
Growth Rate Temperature Dependence

Process Conditions
Chamber Pressure    20 Torr
Carrier Gas         20 slm H2
Source Gas          2% SiH4 in H2
Purge Gas           7 slm H2

Growth Rate (Å/min) vs. Wafer Temperature and Silane Flow Rate

                    Experiment                    Simulation
Wafer               Silane Flow Rate (sccm)       Silane Flow Rate (sccm)
Temperature (C)     30       50       70          30       50       70
700                 -        65.68    80.08       141.00   202.60   254.90
725                 73.12    106.60   138.72      233.00   336.30   423.90
750                 118.28   171.32   216.68      337.30   512.50   658.50

Ratio: Simulation / Experiment

Wafer               Silane Flow Rate (sccm)
Temperature (C)     30       50       70
700                 -        3.08     3.18
725                 3.19     3.15     3.06
750                 2.85     2.99     3.04

Mean                3.07
Standard Deviation  0.11

Note: Poly-silicon thickness for 700 C and 30 sccm was too small to be measured
with available equipment.

Table 5.5: Results comparing poly-Si growth experiments with simulations.
Growth rates are averaged over wafer surface.

In order to test the fit of simulated growth rates to an Arrhenius law, we plot
the logarithm of growth rate as a function of inverse temperature, for each of
the  three  flow  rates  used.  The  plots  are  shown,  along  with  corresponding  plots 
of  experimental  data,  in  Figure  5.11.  We  can  fit  the  simulation  data  to  a  simple 
Arrhenius  law,  where  the  slope  of  each  plot  is  proportional  to  the  activation  energy. 
More  importantly,  the  slopes  of  all  plots  are  consistent  over  the  range  of  flow  rates, 
for  both  simulation  and  experimental  data.  This  means  that  the  activation  energy 
for  simulated  growth  is  nearly  identical  to  the  activation  energy  measured  for  actual 
growth  in  the  Epsilon-1.  Calculated  Arrhenius  parameters  for  experimental  and 
simulation  data  are  presented  in  Table  5.6. 
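
Taking logarithms of the assumed Arrhenius law gives ln R = ln C - (Ea/Rg)(1/T), so the slope of ln R against 1/T is -Ea/Rg. The sketch below illustrates the calculation with a two-point slope between the 725 C and 750 C entries of Table 5.5. The choice of these two points is our assumption (the fitting procedure is not spelled out here); with it, the computed ratios round to the Ea/Rg entries of Table 5.6. The names `ea_over_rg` and `GROWTH` are hypothetical.

```python
import math

K_BOLTZMANN_EV = 8.617333e-5  # Boltzmann constant, eV/K

# Growth rates from Table 5.5 at (725 C, 750 C), keyed by data source and
# silane flow rate in sccm.
GROWTH = {
    ("experiment", 30): (73.12, 118.28),
    ("experiment", 50): (106.60, 171.32),
    ("experiment", 70): (138.72, 216.68),
    ("simulation", 30): (233.00, 337.30),
    ("simulation", 50): (336.30, 512.50),
    ("simulation", 70): (423.90, 658.50),
}


def ea_over_rg(rate_lo, rate_hi, t_lo_c=725.0, t_hi_c=750.0):
    """Two-point estimate of Ea/Rg in kelvin from ln R versus 1/T."""
    t_lo = t_lo_c + 273.15
    t_hi = t_hi_c + 273.15
    return math.log(rate_hi / rate_lo) / (1.0 / t_lo - 1.0 / t_hi)


ratios_k = {key: ea_over_rg(*rates) for key, rates in GROWTH.items()}
energies_ev = {key: K_BOLTZMANN_EV * value for key, value in ratios_k.items()}
```

Multiplying Ea/Rg by the Boltzmann constant in eV/K gives the activation energy per molecule in eV, which is how the eV column of Table 5.6 relates to the Ea/Rg column.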

We note that the calculated activation energies fall within the range of published
activation energies for deposition of silicon from silane for the given range
of process conditions (see, e.g., [92]). In particular, they lie between the activation
energy for silane adsorption (125 kJ/mol), which is associated with temperatures
above 700 C, and hydrogen desorption (192 kJ/mol), which is associated
with temperatures below 700 C. Thus, it is likely that these are the dominant
activating  mechanisms  for  both  actual  and  simulated  growth. 

However,  other  phenomena  also  play  a  role,  resulting  in  the  consistent  upward 
shift  from  experimentally  determined  to  simulated  growth  rates  observed  in  the 
Arrhenius  plots.  By  a  consistent  upward  shift,  we  mean  that  the  ratio  of  simulated 
to  experimentally  determined  growth  rates  is  a  constant  over  the  given  range  of 
operating  conditions.  We  calculated  this  constant  offset  factor  relating  simulation 
and  experimental  data  to  have  mean  value  3.07  with  standard  deviation  0.11,  as 
indicated  in  Table  5.5.  Thus,  the  process-equipment  model,  using  the  Kleijn  model 
for  poly-Si  growth  chemistry,  predicts  growth  rates  that  are  roughly  three  times 
greater than actual growth rates. More importantly, this factor is constant over
the selected range of temperatures and silane flow rates.


Figure  5.11:  Plots  illustrating  Arrhenius  relationship  between  poly-Si  growth  rate 
and wafer temperature in the Epsilon-1. Experimental and simulation data are taken
for three silane flow rates (30, 50, and 70 sccm) and three temperatures (700, 725,
750  C)  at  20  Torr.  Simulated  growth  rates  (top  three  plots)  are  a  factor  of  3.07 
times  greater  than  experimentally  determined  growth  rates  (bottom  three  plots) 
consistently  over  the  given  range  of  temperatures  and  flow  rates. 
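
The offset factor can be reproduced directly from the growth rates in Table 5.5. The following is a small sketch of that calculation; the dictionary and function names are hypothetical, and the use of the sample standard deviation is our assumption (the population form rounds to the same two-decimal value for these data).

```python
import statistics

# Growth rates from Table 5.5, keyed by (wafer temperature in C, silane sccm).
# The 700 C / 30 sccm pair is omitted because no measurement is available.
EXPERIMENT = {
    (700, 50): 65.68, (700, 70): 80.08,
    (725, 30): 73.12, (725, 50): 106.60, (725, 70): 138.72,
    (750, 30): 118.28, (750, 50): 171.32, (750, 70): 216.68,
}
SIMULATION = {
    (700, 50): 202.60, (700, 70): 254.90,
    (725, 30): 233.00, (725, 50): 336.30, (725, 70): 423.90,
    (750, 30): 337.30, (750, 50): 512.50, (750, 70): 658.50,
}

ratios = [SIMULATION[key] / EXPERIMENT[key] for key in sorted(EXPERIMENT)]
offset_mean = statistics.mean(ratios)
offset_std = statistics.stdev(ratios)  # sample standard deviation (assumed)


def corrected_rate(simulated_rate):
    """Rescale a simulated growth rate by the mean offset factor."""
    return simulated_rate / offset_mean
```

Dividing a simulated rate by `offset_mean` gives an estimate on the experimental scale, which is the sense in which the offset factor is taken into account when the model is used for prediction.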


We  now  offer  some  ideas  toward  a  qualitative  explanation  of  the  presence  of  the 
offset factor. The Kleijn model is semi-empirical, i.e., it is based on phenomenological
models and empirical data from growth experiments performed by various
investigators. It is well known that rate constants and sticking coefficients for
gas phase and surface reactions are difficult to measure, and reaction rates under
nominally identical process conditions will vary among different reactors [128].
For example, Kleijn’s study [81] used a cylindrical cold-wall chamber, which is
very  different  from  the  lenticular  hot-wall  chamber  in  the  Epsilon-1.  Moreover, 
the  process  conditions  used  in  the  Kleijn  study  do  not  completely  match  those  of 
our  own.  For  example,  Kleijn  used  pressures  in  the  1-10  Torr  range  (compared  to 
our 20 Torr) and total flow rates on the order of 1 slm (compared to our > 20 slm).
The temperature ranges do coincide.

Parameters of Arrhenius Relationship Fitted to Data
Experiment vs. Simulation

Assumed Relationship: $R_{Si} = C \exp\left(-\frac{E_a}{R_g T_w}\right)$

Process Conditions
Chamber Pressure    20 Torr
Carrier Gas         20 slm H2
Source Gas          2% SiH4 in H2
Purge Gas           7 slm H2

Symbol     Description                                   Experiment
F_SiH4     Silane Flow Rate (sccm)                       30      50      70
E_a        Activation Energy (eV)                        1.69    1.67    1.57
           Activation Energy (J/mol) (x 10^5)            1.63    1.61    1.51
E_a/R_g    Ratio (K) (x 10^4)                            1.96    1.94    1.82
C          Pre-exponential Constant (Å/min) (x 10^10)    7.94    8.90    3.52

Symbol     Description                                   Simulation
F_SiH4     Silane Flow Rate (sccm)                       30      50      70
E_a        Activation Energy (eV)                        1.30    1.48    1.55
           Activation Energy (J/mol) (x 10^5)            1.26    1.43    1.49
E_a/R_g    Ratio (K) (x 10^4)                            1.51    1.72    1.80
C          Pre-exponential Constant (Å/min) (x 10^10)    0.08    1.01    2.81

Table 5.6: Parameters calculated by fitting experimental and simulation data for
poly-Si growth rates to an assumed Arrhenius relationship.

On the other hand, the consistency of the offset factor over a range of temperatures
and silane flow rates indicates the likelihood that activation energies and
sticking coefficients in the Kleijn model are close to those we would find by performing
similar experiments in the Epsilon-1. It appears likely that the offset is
due more to approximations and neglected effects in the process-equipment model.
For example, discrepancies between actual and simulated gas phase flow, temperature,
and species concentration distributions could alter the relative significance of
each  of  the  different  reactive  intermediaries.  Thus,  even  if  sticking  coefficients  for 
the  five  separate  surface  reactions  are  accurate,  total  growth  rate  would  be  shifted. 

Beyond  that,  the  coupling  of  gas  phase  reactions  and  transport  phenomena 
with  surface  chemistry  in  the  process-equipment  model  blurs  any  specific  cause 
and  effect  relationships.  It  must  also  be  emphasized  that  the  model  makes  a 
number  of  approximations  and  assumptions  whose  cumulative  effect  is  difficult  to 
pinpoint.  For  example,  the  wafer  geometry  is  approximated,  so  that  the  area  of 
the  wafer  consuming  reactants  may  not  be  modeled  accurately. 

By  taking  the  offset  factor  of  3.07  into  account,  the  model  as  it  currently  stands 
can be used to accurately predict silicon growth rate in the Epsilon-1 over a range of
temperatures and silane flow rates typically used by NG-ESSS. However, it would
be preferable to improve the transport component of the model, and to conduct a
more extensive experimental study, similar to Kleijn’s, in which chemical kinetics
parameters for growth in the Epsilon-1 are measured over a wide range of operating
conditions.

Chamber Pressure Sensitivity


Here we investigate the relationship between silicon growth rate and chamber pressure
in the Epsilon-1. Actual growth rate is expected to increase as total pressure
rises  due  to  the  increased  number  of  molecular  collisions  on  the  wafer  surface  and 
the  increased  reaction  rates  of  gas  phase  reactions.  Simulation  results  presented 
below  reflect  this  phenomenon. 

We  performed  poly-Si  growth  simulations  at  a  chamber  pressure  of  40  Torr 
using  the  same  temperature  and  silane  flow  rate  conditions  as  experiments  and 
simulations  performed  at  20  Torr.  However,  due  to  time  limitations,  only  two  flow 
rates were used (50, 70 sccm). The simulation results for 20 Torr and 40 Torr are
presented  together  in  Table  5.7. 

As  before,  we  produced  Arrhenius  plots  and  calculated  Arrhenius  parameters 
in  order  to  see  how  pressure  affects  growth  rate.  The  results  are  presented  in 
Figure  5.12  and  Table  5.8.  We  see  that  growth  rate  increases  by  a  factor  of  1.26 
as  pressure  increases  from  20  Torr  to  40  Torr.  This  offset  factor  is  constant  over 
the  given  range  of  operating  conditions.  This  result  is  consistent  with  a  study  by 
Kleijn  [81]  in  which  growth  rate  increases  as  the  logarithm  (base  10)  of  pressure. 
Furthermore,  activation  energies  for  40  Torr  are  slightly  higher  than  those  for 
20  Torr  but  still  within  the  expected  range. 
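
The pressure offset factor, and a rough comparison with a logarithmic pressure dependence, can be checked from the simulated growth rates in Table 5.7. The sketch below is illustrative only. In particular, reading "growth rate increases as the logarithm (base 10) of pressure" as a rate proportional to log10 of the pressure expressed in Torr is our assumption, and that reading depends on the pressure unit.

```python
import math
import statistics

# Simulated growth rates from Table 5.7, keyed by
# (wafer temperature in C, silane flow rate in sccm).
RATE_20_TORR = {
    (700, 50): 202.60, (700, 70): 254.90,
    (725, 50): 336.30, (725, 70): 423.90,
    (750, 50): 512.50, (750, 70): 658.50,
}
RATE_40_TORR = {
    (700, 50): 270.00, (700, 70): 315.40,
    (725, 50): 423.40, (725, 70): 510.00,
    (750, 50): 651.90, (750, 70): 820.90,
}

pressure_ratios = [RATE_40_TORR[k] / RATE_20_TORR[k] for k in sorted(RATE_20_TORR)]
mean_ratio = statistics.mean(pressure_ratios)
std_ratio = statistics.stdev(pressure_ratios)

# Ratio expected if rate were proportional to log10(P), with P in Torr
# (an assumed reading of the logarithmic dependence).
log_model_ratio = math.log10(40.0) / math.log10(20.0)
```

Under that assumed reading, doubling the pressure from 20 to 40 Torr would raise the rate by a factor of about 1.23, close to the simulated mean ratio.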

Growth Rate Pressure Dependence
Epsilon-1 Simulation

Process Conditions
Carrier Gas         20 slm H2
Source Gas          2% SiH4 in H2
Purge Gas           7 slm H2

Simulated Growth Rate (Å/min) vs. Wafer Temperature and Silane Flow Rate

Chamber Pressure    20 Torr                     40 Torr
Wafer               Silane Flow Rate (sccm)     Silane Flow Rate (sccm)
Temperature (C)     50        70                50        70
700                 202.60    254.90            270.00    315.40
725                 336.30    423.90            423.40    510.00
750                 512.50    658.50            651.90    820.90

Ratio: 40 Torr / 20 Torr

Wafer               Silane Flow Rate (sccm)
Temperature (C)     50        70
700                 1.33      1.24
725                 1.26      1.20
750                 1.27      1.25

Mean                1.26
Standard Deviation  0.04

Table 5.7: Results from poly-Si growth simulations comparing growth rates at
20 Torr pressure with growth rates at 40 Torr pressure. Growth rates are averaged
over wafer surface.

Parameters of Arrhenius Relationship Fitted to Simulation Data
Pressure Dependence

Assumed Relationship: $R_{Si} = C \exp\left(-\frac{E_a}{R_g T_w}\right)$

Process Conditions
Carrier Gas         20 slm H2
Source Gas          2% SiH4 in H2
Purge Gas           7 slm H2

Symbol     Description                                   20 Torr           40 Torr
F_SiH4     Silane Flow Rate (sccm)                       50      70        50      70
E_a        Activation Energy (eV)                        1.48    1.55      1.52    1.68
           Activation Energy (J/mol) (x 10^5)            1.43    1.49      1.47    1.62
E_a/R_g    Ratio (K) (x 10^4)                            1.72    1.80      1.77    1.95
C          Pre-exponential Constant (Å/min) (x 10^10)    1.01    2.81      1.98    14.77

Table 5.8: Parameters calculated by fitting simulation data for poly-Si growth
rates to an assumed Arrhenius relationship.

Flow Rate Sensitivity

We have studied the influence of silane flow rate on growth rate using the experimental
and simulation data already presented. The recipe setting for silane flow
rate directly affects two process variables concerning the gas mixture at the inlet,
namely, silane mole fraction and overall gas velocity. As silane mole fraction
increases, the contribution of gas phase reactions to the overall deposition process
will  be  enhanced  [81].  We  now  briefly  discuss  the  relationship  between  flow  velocity 
and deposition rate.

[Plot: Arrhenius plots of growth rate; axis label 1000/T (1/K)]
Figure 5.12: Plots illustrating Arrhenius relationship between poly-Si growth rate
and wafer temperature in the Epsilon-1. Simulation data are taken for two silane
flow rates (50, 70 sccm), three temperatures (700, 725, 750 C), and two chamber
pressures (20, 40 Torr). Growth rates for 40 Torr pressure are a factor of 1.26
times greater than growth rates for 20 Torr pressure consistently over the given
range of temperatures and flow rates.

It is typically assumed that the gas stream can be divided into two regions. In
the region away from the wafer surface, the gas stream is assumed to flow with
relatively constant velocity, while in the region next to the wafer surface, there
exists a stagnant boundary layer where the flow velocity is zero. In this model,
mass transfer of the reactant species through the stagnant layer is dominated by a
diffusion process. The mass flux $\Gamma$ impinging upon the wafer surface is proportional
to the diffusion coefficient $D$ and the difference between the reactant concentration
in the full flow $C_g$ and at the surface $C_s$, and inversely proportional to the thickness
of the boundary layer $\delta$, i.e.,

$$\Gamma = \frac{D\,(C_g - C_s)}{\delta} \tag{5.35}$$

Furthermore, the average boundary layer thickness $\delta$ is inversely proportional to
the square root of the flow velocity $V$, i.e.,

$$\delta = C_1 V^{-1/2} \tag{5.36}$$

The result is that the impinging flux of reactants $\Gamma$ is proportional to the square
root of flow velocity, i.e.,

$$\Gamma = C_2 V^{1/2} \tag{5.37}$$

We express the relationship between deposition rate and flow velocity as a power
law

$$R_{Si} = C_3 V^{1/2}. \tag{5.38}$$

If the power law (5.38) provides an accurate model of the relationship between
growth rate and flow velocity in the Epsilon-1, then the slope of a plot of the logarithm
of growth rate versus the logarithm of flow velocity should be approximately
0.5. This does not hold true (or come anywhere close) for our experimental and
simulation data. However, we were able to determine an interesting relationship,
by substituting silane flow rate $F_{SiH_4}$ for flow velocity $V$ in (5.38) and letting the
power law exponent vary, i.e.,

$$R_{Si} = C F_{SiH_4}^{\alpha} \tag{5.39}$$

where $\alpha$ denotes the power law exponent, found by calculating the slope of $\log R_{Si}$
versus $\log F_{SiH_4}$. The log-log plots for experimental and simulation data over the
range of temperatures we used are presented in Figure 5.13. The resulting values
for $\alpha$ are given in Table 5.9. We see that the power law exponent $\alpha$ in (5.39) is
roughly 0.7.
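
As an illustration of how the exponent can be obtained, the sketch below takes the slope of log growth rate against log silane flow rate between the 30 sccm and 70 sccm entries of Table 5.5. Using only these two end points is our assumption about the calculation; with it, the results round to the values listed in Table 5.9. The function and dictionary names are hypothetical.

```python
import math


def power_law_exponent(flow_lo, rate_lo, flow_hi, rate_hi):
    """Slope of log(rate) versus log(flow) between two operating points."""
    return math.log(rate_hi / rate_lo) / math.log(flow_hi / flow_lo)


# Growth rates from Table 5.5 at 30 and 70 sccm silane, keyed by data source
# and wafer temperature in C. The experimental 700 C series is omitted
# because the 30 sccm measurement is unavailable.
END_POINTS = {
    ("experiment", 725): (73.12, 138.72),
    ("experiment", 750): (118.28, 216.68),
    ("simulation", 700): (141.00, 254.90),
    ("simulation", 725): (233.00, 423.90),
    ("simulation", 750): (337.30, 658.50),
}

exponents = {
    key: power_law_exponent(30.0, lo, 70.0, hi)
    for key, (lo, hi) in END_POINTS.items()
}
```

An exponent of 1 would correspond to growth rate strictly proportional to silane flow rate, and 0.5 to the boundary-layer velocity scaling of (5.38); the computed values fall between the two.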

We  also  note  that  as  the  gas  mixture  flows  through  the  process  chamber,  it 
heats  up  and  consequently  its  density  decreases  and  its  velocity  increases  (see 
Section  5.5.3).  This  may  partially  account  for  the  exponent  being  greater  than 
0.5. 

Based  on  Equation  (5.35),  we  also  expect  growth  rate  to  be  proportional  to 
silane concentration and hence silane flow rate at the inlet. As shown in Figure 5.14,
this relationship holds, with a temperature dependent proportionality
constant, reflecting the fact that the process is thermally driven. This is in contrast
to the temperature independent nature of the power law exponent. We emphasize
that the power law is a purely mass transport controlled phenomenon.

Figure 5.13: Plots illustrating power law relationship between poly-Si growth rate
and silane flow rate in the Epsilon-1. Simulation and experimental data are taken
for three silane flow rates (30, 50, 70 sccm) and three wafer temperatures (700, 725,
750 C). The power law exponent (slope of plots) is approximately 0.7, consistently
over the given range of temperatures. The temperature independence of the power
law exponent indicates a completely mass transport controlled phenomenon.

Carrier  Gas  Sensitivity 

A  preliminary  investigation  of  the  relationship  between  growth  rate  and  carrier 
gas  was  conducted  by  simulating  poly-Si  growth  using  N2  carrier  gas  instead  of  H2 
carrier gas. Wafer temperature was set at 750 C and silane flow rate was set at
70 sccm. The resulting growth rate was 1661 Å/min, which is a factor of 2.5 times
greater  than  the  corresponding  simulated  growth  rate  using  H2  carrier. 

These  simulation  results  are  in  accordance  with  a  study  by  Kleijn  [81].  There, 
use  of  nitrogen  results  in  an  increase  in  buoyancy  effects  which  causes  an  increase 
in  the  average  residency  time  of  gases  in  the  reactor.  Thus,  gases  are  heated  for  a 
longer period of time and the contribution of gas phase reactions becomes greater.
Also, thermal diffusion effects are weaker in nitrogen than in hydrogen. Both of
these phenomena cause a larger growth rate in nitrogen than in hydrogen.

Relationship Between Growth Rate and Silane Flow Rate
Experiment vs. Simulation

Assumed Relationship: $R_{Si} = C F_{SiH_4}^{\alpha}$

Process Conditions
Chamber Pressure    20 Torr
Carrier Gas         20 slm H2
Source Gas          2% SiH4 in H2
Purge Gas           7 slm H2

Symbol    Description              Experiment              Simulation
T_w       Wafer Temperature (C)    700     725     750     700     725     750
α         Power Law Exponent       -       0.76    0.71    0.70    0.71    0.79

Table 5.9: Power law exponent calculated by fitting experimental and simulation
data for poly-Si growth rate to an assumed power law relationship between growth
rate and silane flow rate.


5.5.2  Deposition  Uniformity  Prediction 

In  Section  5.2.3,  we  presented  an  argument,  based  on  anecdotal  evidence,  that 
temperature  uniformity  does  not  produce  deposition  thickness  uniformity  in  the 
Epsilon-1 reactor, even for thermally activated processes. On the contrary, thermocouple
offsets are set so that the temperature distribution on the wafer surface
is intentionally non-uniform. This occurs because of the various transport phenomena
that couple with thermally activated chemical mechanisms to influence
silicon  deposition  rate  in  the  Epsilon-1. 

[Plot: Deposition Rate vs. Silane Flow Rate; horizontal axis: Flow Rate (sccm), 30 to 70]
Figure 5.14: Plots illustrating linear relationship between poly-Si growth rate and
silane flow rate in the Epsilon-1. Simulation and experimental data are taken for
three silane flow rates (30, 50, 70 sccm) and three wafer temperatures (700, 725,
750 C). Slopes of the plots are temperature dependent, reflecting the fact that the
process is thermally driven.

In this section we use simulation results to illustrate the phenomenon. We
simulated poly-Si growth using 20 Torr chamber pressure, 750 C wafer temperature,
70 sccm silane flow rate, 20 slm H2 carrier, and 7 slm H2 purge. The 750 C
temperature is uniform across the entire surface of the wafer. Figure 5.15 shows a
contour plot of the resulting steady-state growth rate on the wafer surface.

[Contour plot of deposition rate on the wafer surface; legend spans 628.819 to 681.001]
Figure 5.15: Spatial distribution of steady-state deposition rate (Å/min) on wafer
surface resulting from poly-Si growth with 750 C uniform temperature. The picture
shows a non-uniform deposition rate despite the uniform temperature profile.
Process conditions are 20 Torr pressure and 70 sccm silane flow rate. Gas flow is
from bottom of picture (front/upstream) to top of picture (rear/downstream).

Simulated growth rate varies from a minimum of 628 Å/min at the downstream
side to a maximum of 681 Å/min at the upstream outer edge. This represents an
8.4% maximum variation in growth rate across the wafer surface. If thermal activation
were the sole contributor to growth rate, then the 8.4% growth rate variation
would correspond to a 0.42% maximum variation in temperature. However, we
know this is not the case, since we have imposed a perfectly uniform temperature
profile on the wafer. The existence of other contributing factors is apparent.
On the other hand, this does show that compensation for the other factors may
be achievable with much smaller thermocouple offsets than those currently used,
which create a maximum temperature variation of 8.5% between thermocouple
locations. The advantage to this would be reduced mechanical stress and a higher
average temperature resulting in higher growth rates.
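
The link between a growth rate variation and an equivalent temperature variation follows from the Arrhenius law: since ln R = ln C - (Ea/Rg)/T, a relative temperature change dT/T changes ln R by (Ea/(Rg T)) dT/T. The sketch below applies this to the simulated extremes. It is an illustration under stated assumptions: the value of Ea/Rg used for the 0.42% figure is not stated here, so the example takes the simulation value for 70 sccm from Table 5.6, and the function names are hypothetical.

```python
import math


def growth_variation(rate_min, rate_max):
    """Maximum relative growth rate variation, referred to the minimum rate."""
    return (rate_max - rate_min) / rate_min


def equivalent_temperature_variation(rate_min, rate_max, ea_over_rg_k, temp_k):
    """Relative temperature change dT/T that an Arrhenius law alone would
    need in order to produce the given growth rate ratio."""
    return math.log(rate_max / rate_min) / (ea_over_rg_k / temp_k)


variation = growth_variation(628.0, 681.0)

# Assumed Ea/Rg: 1.80e4 K (Table 5.6, simulation, 70 sccm); wafer at 750 C.
temperature_equivalent = equivalent_temperature_variation(
    628.0, 681.0, ea_over_rg_k=1.80e4, temp_k=750.0 + 273.15
)
```

With the Ea/Rg values of Table 5.6 the equivalent temperature variation comes out at a few tenths of a percent, the same order as the figure quoted above and far below the 8.5% variation created by the current thermocouple offsets.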

It is also worthwhile to examine the spatial distribution of the simulated growth
rate non-uniformity. Growth rate appears to increase from wafer center to edges,
and to decrease from front (upstream) side to rear (downstream) side. Thus, the
expected depletion effect appears in poly-Si growth simulations. Non-uniformities
in gas heating, resulting in non-uniform gas phase reactions and thermal diffusion,
may also be responsible for growth rate variations. Further simulations will be
required to isolate those effects.

Thermocouple offset values described in Section 5.2.3 and used by NG-ESSS
to produce uniform growth create a temperature distribution that is hotter at the
front  (upstream)  than  the  rear  (downstream),  and  hotter  at  the  center  than  at 
the  side.  The  latter  seems  to  match  what  we  would  expect  given  our  simulation 
results,  i.e.,  cool  the  side  to  reduce  growth  rate  there.  On  the  other  hand,  it  is 
difficult  to  explain  the  former,  since  it  would  exacerbate  any  reactant  depletion 
effect.  Perhaps  the  simulation  understates  the  effect  of  downstream  gas  phase 
reactions. 

We  emphasize  that  in  actual  operation,  the  wafer  is  rotating,  so  that  growth 
rate variations are averaged, and the significance of front-to-rear variations becomes
unclear. We cannot draw any further conclusions at this time. A further
experimental  study  of  the  effect  of  thermocouple  offsets  on  uniformity  is  necessary. 

5.5.3  Process  Chamber  Transport  Phenomena  Prediction 

In this section we study the gas phase transport phenomena in the process chamber
of the Epsilon-1. As stated earlier, these effects play an important role in
determining  deposition  rate  and  uniformity  for  silicon  growth.  In  particular,  we 
want  to  observe  and  analyze  gas  flow  patterns  and  non-uniformities  in  the  spatial 
distribution  of  reactant  species. 

Simulation  results  described  in  this  section  are  for  poly-Si  growth  at  20  Torr 
pressure,  750  C  uniform  wafer  temperature,  450  C  chamber  wall  temperature, 
70 sccm inlet silane flow rate, 20 slm hydrogen carrier flow rate, and 7 slm hydrogen
purge  flow  rate. 

We first examine the flow field in the process chamber. Gases are pumped into
the Epsilon-1 process chamber from two inlets: the process gas inlet in the upper
chamber section and the purge gas inlet in the lower chamber section. They are
pumped out from the process chamber through one outlet located in the upper
chamber section. Depending on process and purge inlet flow settings, it is possible
for gases to flow from upper to lower chamber sections and vice versa.

Figure 5.16 shows a view of the simulated flow field in the Epsilon-1 process
chamber. Several features are of interest. We observe that there is no gas flow
from the upper chamber through gaps to the lower chamber. Thus, the purge
flow is effective in this regard. Also, gases from the lower chamber enter the upper
chamber mainly through the gap toward the rear of the chamber and also somewhat
through the gap near the side chamber wall. This causes the flow in the vicinity of
the wafer to be directed from the side wall toward the center line of the chamber.
In the y-direction, the flow takes a parabolic profile, i.e., slightly faster at the top
of the chamber than near the wafer.

The contours in Figure 5.16 correspond to the flow speed in the z-direction, i.e.,
from front to rear. We observe that the gas velocity increases from front to rear.
This is due to the fact that the gas heats up as it passes by the hot chamber walls
and wafer level apparatus, causing the density of the gas mixture to decrease.
However, differences in density do not cause any buoyancy driven recirculation
cells in this simulation. This is because the flow velocity is relatively high and the
temperature gradients in the hot wall chamber are not severe.

The heating of the gases is observed in Figure 5.17, which shows a contour plot
of simulated temperature distribution in the Epsilon-1 process chamber. In the
vicinity of the wafer the temperature increases from side to center and from wafer
to chamber top wall, creating a highly non-uniform temperature field in the gas
phase. We note that because the solid surfaces in the lower chamber section are
also hot, the purge gas flowing up through the rear and side gaps does not cause
a flow of cool gas to enter the upper chamber.

Figure 5.16: Steady state flow pattern of gases in the Epsilon-1 process chamber.
Process conditions are 20 slm hydrogen carrier, 70 sccm silane source, 750 C wafer
temperature, 450 C chamber wall temperature, and 20 Torr pressure.

We  now  study  the  spatial  distribution  of  the  concentrations  of  the  various 
reactant  gases  in  the  Epsilon-1.  Silane  enters  the  process  chamber  through  the 
process  inlet  in  the  upper  chamber  section.  It  is  also  produced  by  one  of  the  five 
gas  phase  reactions  in  the  Kleijn  model.  The  gas  phase  reactions  also  produce  the 
reactive  intermediaries:  disilane,  trisilane,  silylsylene,  and  silylene.  All  of  these 
gases  are  eventually  diluted  in  the  hydrogen  carrier  and  hydrogen  purge  gas. 

Figure 5.18 shows a contour plot of simulated silane mass fraction distribution.
Silane mass fraction is a maximum at the inlet and becomes depleted by gas phase
and surface reactions as the gas passes over the heated susceptor and wafer. It is
at a minimum in locations where the hydrogen purge gas is flowing most heavily
into the upper chamber section, in particular, at the side and rear of the ring.

Figure 5.17: Cross-sectional view of the steady-state gas phase temperature distribution
in the Epsilon-1 process chamber during growth of poly-Si. Process
conditions are 750 C wafer temperature, 450 C chamber wall temperature, 20 slm
hydrogen carrier, 70 sccm silane source, and 20 Torr pressure.

The concentration distributions of the other reactive intermediaries are illustrated
by contour plots in Figure 5.19. Because they do not enter at the inlet,
these species appear in the flow only once the gas is hot enough for them to be
produced, in this case at the front edge of the susceptor ring. Like silane, these
species are depleted by surface reactions at the wafer surface. In fact, we observe
that silylene, silylsylene, and trisilane are almost completely consumed by surface
reactions. On the other hand, some disilane remains just above the wafer surface,
although it is at a maximum in areas surrounding the wafer perimeter.

Figure 5.18: Steady-state silane mass fraction distribution in the Epsilon-1 process
chamber during poly-Si growth. Process conditions are 750 C wafer temperature,
450 C chamber wall temperature, 20 slm hydrogen carrier, 70 sccm silane source,
and 20 Torr pressure.

It  is  clear  that  the  spatial  distribution  of  reactant  species  concentrations  is 
strongly  influenced  by  the  flow  field,  the  gas  phase  temperature  distribution,  and 
surface  reactions.  We  suggested  earlier  in  this  report  that  thermal  diffusion  may 
also  play  a  role.  This  effect  is  more  difficult  to  isolate  and  identify. 

Figure 5.20 shows two contour plots: the top plot is for silane mass fraction
and the bottom plot is for gas phase temperature. Both plots are snapshots of
the x-z plane approximately 2 mm above the wafer surface. In the area just
above  the  wafer  and  susceptor,  it  is  not  possible  to  isolate  any  effect  thermal 
diffusion  may  have,  i.e.,  separate  it  from  the  depletion  effect  caused  by  gas  phase 
and  surface  reactions.  However,  if  we  restrict  attention  to  the  area  between  the 
side  ring-shelf  gap  and  the  side  chamber  wall,  we  observe  a  silane  mass  fraction 
gradient  that  may  be  due  to  the  Soret  effect.  In  particular,  silane  mass  fraction 
increases  steadily  along  the  chamber  side  wall  and  side  ring-shelf  gap  from  front  to 
rear.  It  is  possible  that  the  relatively  heavy  silane  molecules  have  diffused  toward 
the  cooler  area  near  the  chamber  side  wall  at  the  front  ring-shelf  gap  and  away 
from  the  hotter  area  toward  the  rear.  Again,  we  emphasize  that  this  speculation 
needs  to  be  confirmed  by  conducting  additional  simulations  and  possibly  actual 
experiments. 

We  also  note  that  simulation  results  show  no  diffusion  of  silane  or  other  reactive 
intermediaries  into  the  lower  chamber  section.  Evidently,  the  convective  forces  of 
the  gas  flow  dominate  through  the  gaps  so  that  any  heavy  molecules  diffusing 
toward  the  lower  chamber  are  immediately  swept  back  into  the  upper  chamber. 

5.5.4  Purge  Flow  Optimization 

As  we  observed  in  Section  5.5.3,  the  7  slm  H2  purge  flow  is  effective  in  preventing 
any  source  gases  from  entering  the  lower  chamber  section.  In  particular,  the  mass 
fractions  of  silane  and  other  reactive  intermediaries  in  the  lower  chamber  were 
zero  for  those  simulations.  This  motivates  an  examination  of  the  relationship 
between  purge  flow  rate,  reactant  concentrations  in  the  lower  chamber,  and  possible 
deposition on the back-side of the susceptor. The objective is to optimize purge flow
rate, where the cost to be minimized is proportional to the amount of consumed
H2,  and  any  back-side  deposition  is  unacceptable. 

Figure  5.21  shows  contours  of  silane  mass  fraction  and  flow  streamlines  resulting 
from  two  simulations,  each  using  a  different  H2  purge  flow  rate.  The  top  and 
bottom  figures  correspond  to  7  slm  and  2  slm  H2  purge  flow  rates,  respectively. 
We  observe  that  purge  flow  rate  has  an  effect  on  both  the  flow  pattern  in  the 
process  chamber  and  the  distribution  of  reactant  concentrations.  For  the  7  slm 
simulation,  steady-state  silane  concentration  in  the  lower  chamber  is  zero,  and 
streamlines  indicate  a  regular  smooth  flow  from  inlets  to  outlet,  with  purge  gases 
entering  the  upper  chamber  mainly  through  the  rear  ring-shelf  gap.  For  the  2  slm 
simulation, silane concentration in the lower chamber is nonzero, and the flow
field becomes irregular, including mixing between upper and lower chambers and
recirculation  cells  in  the  lower  chamber. 

For  the  above  simulations,  we  modeled  both  the  front-side  of  the  wafer  and 
the  back-side  of  the  susceptor  as  reacting  surfaces.  We  note  that  the  wafer  and 
susceptor  have  different  material  properties,  but  we  modeled  the  back-side  of  the 
susceptor as if it were the back-side of a silicon wafer. Process conditions were set
to 20 Torr pressure, 750 C wafer temperature, and 70 sccm silane flow rate at the
upper chamber inlet. For the 7 slm purge flow simulation, no back-side deposition
occurred, and average front-side deposition rate was 687 Å/min. For the 2 slm
purge flow simulation, back-side deposition rate varied from 205 to 359 Å/min
across the susceptor back-side surface, and average front-side deposition rate was
719 Å/min. Apparently, the different flow pattern resulting from a reduction in
purge flow rate also causes the front-side deposition rate to increase.

For purposes of optimization we performed one additional simulation with a
5 slm H2 purge flow. Results qualitatively matched those for the 7 slm purge
simulation, i.e., no back-side deposition and zero silane concentration in the lower
chamber  section.  Based  on  simulation  results,  we  can  reduce  the  flow  rate  of 
H2  purge  from  7  slm  to  5  slm,  thus  reducing  the  use  of  consumable  gases  while 
still  maintaining  purge  effectiveness.  However,  a  reduction  to  2  slm  is  too  much 
and  results  in  unacceptable  back-side  deposition.  The  optimum  purge  flow  rate  is 
somewhere  between  2  slm  and  5  slm.  We  did  not  proceed  further  with  this  study. 
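
Narrowing the bracket between 2 slm and 5 slm could be organized as a bisection over purge flow rate. This is a sketch under the assumption that purge effectiveness is monotone in flow rate (ineffective below some threshold, effective above it), which the three simulations suggest but do not establish. The callable `purge_is_effective` stands in for a full CFD run and is hypothetical; the threshold used in the stand-in is arbitrary and is not a simulation result.

```python
from typing import Callable

def min_effective_purge(purge_is_effective: Callable[[float], bool],
                        lo_slm: float, hi_slm: float,
                        tol_slm: float = 0.25) -> float:
    """Bisect for the smallest purge flow (slm) that still prevents back-side deposition.

    Requires purge_is_effective(lo_slm) to be False and purge_is_effective(hi_slm)
    to be True, and assumes effectiveness is monotone in flow rate.
    """
    if purge_is_effective(lo_slm) or not purge_is_effective(hi_slm):
        raise ValueError("bracket must run from an ineffective to an effective flow")
    while hi_slm - lo_slm > tol_slm:
        mid = 0.5 * (lo_slm + hi_slm)
        if purge_is_effective(mid):
            hi_slm = mid   # mid still works: try lower flows
        else:
            lo_slm = mid   # mid fails: more purge is needed
    return hi_slm          # smallest flow known to be effective

# Stand-in for a CFD run; the 3.3 slm threshold is made up for illustration only.
def toy_simulation(flow_slm: float) -> bool:
    return flow_slm >= 3.3

best = min_effective_purge(toy_simulation, lo_slm=2.0, hi_slm=5.0)
print(f"smallest effective purge flow found: {best:.3f} slm")
```

Each bisection step costs one full simulation, so a coarse tolerance such as 0.25 slm keeps the number of additional runs small (four in this example).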

5.6  Remarks 

We  have  presented  strong  anecdotal  evidence,  based  on  thermocouple  offsets  used 
by  NG-ESSS  to  produce  uniform  silicon  growth  in  the  Epsilon-1  reactor,  that  gas 
phase transport phenomena play an important role in determining deposition uniformity,
even for thermally activated growth. This conjecture is in agreement with
a study by Kleijn [81] using simulation and experimental data for silicon growth in
a cold-wall cylindrical reactor. Models for silicon growth that cannot be coupled to
gas  phase  transport  phenomena  and  that  use  a  simplified  chemical  kinetics  model 
are  inadequate  for  describing  the  essential  physics  and  chemistry.  This  motivated 
the  development  of  a  3-dimensional  comprehensive  process-equipment  model  for 
silicon  growth  in  the  Epsilon-1,  incorporating  as  many  relevant  transport  effects 
and  chemical  mechanisms  as  was  feasible  from  a  practical  standpoint. 

The  process-equipment  model  provides  a  tool  for  prediction  of  deposition  rate 
and  other  process  variables,  i.e.,  the  process-equipment  state,  for  a  given  set  of 
recipe inputs (process conditions) and equipment settings. The predictive capability
of the model was tested by comparing results of poly-Si growth simulations to
experimental data. Simulations predict growth rates that are roughly three times
greater than actual growth rates, consistently over the given range of operating
conditions. 

Using  the  process-equipment  model,  we  performed  simulations  in  order  to  study 
the  factors  that  influence  deposition  rate  and  uniformity  for  silicon  growth  in  the 
Epsilon-1. The relationships between poly-Si growth rate and wafer temperature,
chamber pressure, silane flow rate, and hydrogen carrier flow rate were investigated.
Although we used a complicated model for poly-Si growth mechanisms
developed by Kleijn [81], growth rate temperature sensitivity can be simplified to
an Arrhenius relationship. Simulation results indicate that growth rate increases
with the logarithm (base 10) of chamber pressure, in agreement with known relationships.
We found a power law relationship connecting poly-Si growth rate with
silane  flow  rate  at  the  inlet,  with  power  law  exponent  roughly  0.7.  Finally,  we 
demonstrated  that  substitution  of  nitrogen  for  hydrogen  as  the  carrier  gas  results 
in  a  significantly  increased  deposition  rate. 
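
A power law exponent such as the 0.7 quoted above is commonly estimated as the slope of a straight-line fit in log-log coordinates. The sketch below illustrates that procedure on synthetic data generated from an assumed law, rate = c * flow^0.7. The prefactor and the flow values are placeholders chosen for illustration, not simulation results, and the function name is hypothetical.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of y = c * x**p in log-log space; returns (c, p)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    p = sxy / sxx                      # slope = power law exponent
    c = math.exp(mean_y - p * mean_x)  # intercept gives the prefactor
    return c, p

# Synthetic example only: flows in sccm and rates from an assumed law.
flows = [30.0, 50.0, 70.0, 90.0, 110.0]
rates = [35.0 * f ** 0.7 for f in flows]

c_hat, p_hat = fit_power_law(flows, rates)
print(f"fitted exponent: {p_hat:.3f}, prefactor: {c_hat:.2f}")
```

With real simulation output the points will not lie exactly on a line, and the scatter about the fit indicates how well a single exponent describes the data.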

Simulation results showed that temperature uniformity does not guarantee deposition
uniformity in the Epsilon-1. Simulations using a uniform wafer temperature
in the thermally activated regime produced growth rates that were non-uniform
across the wafer surface. Thus, it is apparent that achieving deposition
uniformity requires some degree of temperature non-uniformity to compensate for
the  effects  of  other  phenomena  including  reactant  depletion,  gas  heating  and  gas 
phase  reactions,  thermal  diffusion  of  species,  and  flow  patterns. 

We have taken steps toward achieving manufacturing objectives. Model predictions
allow NG-ESSS to simulate growth experiments in advance, narrow parameter
choices, and perform fewer actual experiments. Conditions and settings can be
optimized off-line, taking into account simulation results and sensitivity analysis
for pressure, temperature, and flow rates. The effect of adjustments to wafer temperature
set-point, chamber pressure, source gas flow rate, thermocouple offsets,
and  injector  settings  can  be  predicted  and  tuned  off-line.  Simulation  results  show 
that  consumption  of  process  gases  can  be  reduced  by  decreasing  the  purge  gas  flow 
from  7  slm  to  5  slm  and  possibly  further  without  compromising  the  ability  of  the 
purge  gas  to  prevent  back-side  deposition. 

Figure 5.19: Steady-state mass fraction distribution for reactive intermediaries
during poly-Si growth: silylene (top-left), silylsylene (top-right), disilane (bottom-left),
trisilane (bottom-right). Process conditions are 750 C wafer temperature,
450 C chamber wall temperature, 20 slm hydrogen carrier, 70 sccm silane source,
and 20 Torr pressure.

Figure 5.20: Illustration of thermal diffusion (Soret effect) in the Epsilon-1 process
chamber. Contours of steady-state silane mass fraction (top) and temperature
(bottom) for the x-z plane approximately 2 mm above the wafer surface.

Figure 5.21: Comparison of flow streamlines and steady-state silane mass fraction
contours for hydrogen purge flow rates of 7 slm (top) and 2 slm (bottom). The
higher purge flow rate results in zero silane concentration in the lower chamber
section, no back-side deposition, and regular flow from inlets to outlet. The lower
purge flow rate is ineffective, producing non-zero silane concentration in the lower
chamber section and some back-side deposition.

Chapter 6


Modeling  and  Reduction  for  RTP 
Heat  Transfer 

6.1  Introduction 

This  chapter  addresses  the  problem  of  deriving  low-order  models  for  RTP  control 
systems.  We  focus  on  one  particular  aspect  of  the  overall  process-equipment  model 
for the Epsilon-1 reactor: heat transfer within the solid wafer and among the
wafer,  heat  lamps,  chamber  walls,  and  flowing  gases.  A  physical  model  of  the 
wafer  thermal  dynamics  is  formulated  in  Section  6.2,  accounting  for  conductive, 
radiative,  and  convective  effects. 

Control  of  the  temperature  distribution  on  the  wafer  surface  is  achieved  through 
several  independent  lamp  zone  actuators,  each  of  which  causes  a  different  set  of 
tungsten-halogen  lamps  to  irradiate  the  wafer.  The  lamp  heating  component  of 
the  heat  transfer  model  is  derived  in  Section  6.3,  based  on  several  simplifying 
assumptions  and  a  detailed  view  factor  analysis  of  the  Epsilon-1  geometry  and 
lamp  system  characteristics.  We  also  present  the  results  of  growth  experiments 
that provide a degree of empirical validation for portions of the view factor analysis.
Although  contact  measurements  do  not  occur  during  an  actual  deposition  run,  we 
incorporate  a  set  of  thermocouples  into  the  model  to  play  the  role  of  temperature 
sensors. 

Given  the  RTP  heat  transfer  model,  we  derive  low-order  approximations  of 
the  evolution  equations  via  application  of  the  reduction  approaches  detailed  in 
Chapter  3,  i.e.,  POD  and  balanced  truncation.  A  comparative  study  is  presented 
in  Section  6.4,  in  which  the  effectiveness  of  the  two  approaches  is  examined  through 
numerical  simulations  of  the  full  and  reduced  RTP  control  system  models.  We 
summarize  and  make  some  additional  remarks  in  Section  6.5. 


6.2  Wafer  Heat  Transfer  Model 

The  model  for  wafer  heat  transfer  is  a  modified  version  of  models  presented  in 
[2,  28,  97,  138,  139].  It  is  based  on  an  energy  balance  for  a  heat  conducting  solid 
which  emits  and  absorbs  heat  radiation  at  its  boundary  surfaces.  The  model  takes 
into  account  simplified  effects  of  conductive,  radiative  (including  lamp  heating), 
and  convective  heat  transfer.  Both  a  continuum  model  and  a  discretized  version 
are  presented  here,  based  on  identical  principles  of  energy  balance.  Although  the 
wafer  is  a  continuous  solid  body,  a  discretized  model  is  required  for  purposes  of 
numerical  solution. 

Wafer  Characteristics 

We assume that the wafer shape is perfectly cylindrical. Its geometry is illustrated
in Figure 6.1. The heat transfer model will be formulated in cylindrical coordinates
with radial variable $r$, azimuthal variable $\theta$, and axial variable $z$. The wafer radius
is denoted $R_w$ and the wafer thickness is denoted $\Delta z$, so that the top surface of
the wafer has $z$-coordinate $\Delta z$ and the bottom surface has $z$-coordinate 0. Note
that for purposes of this study, the wafer and susceptor have been combined into
a single homogeneous solid body.

Figure 6.1: Wafer geometry used in the heat transfer model.

We  assume  that  the  wafer  is  pure  silicon,  and  ignore,  e.g.,  the  layer  of  silicon 
dioxide  on  which  poly-silicon  is  deposited.  Thus,  we  use  thermal  and  optical 
properties  for  pure  silicon.  The  physical  constants  are  given  in  Appendix  G. 


Continuum  Model 


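The temperature field in the solid wafer is denoted $T_w = T_w(t, r, \theta, z)$, where $t$
represents time. Time evolution of $T_w$ is governed by a PDE (usually referred to
as the heat equation) which models heat conduction within the wafer, together
with boundary conditions (BCs) which model net heat flow to and from the wafer
boundary surfaces (top, bottom, and edge). The PDE is given in cylindrical coordinates,
for $t > 0$, $0 < r < R_w$, $0 < \theta < 2\pi$, and $0 < z < \Delta z$, by

\[
\rho_w C_{p_w} \frac{\partial T_w}{\partial t}
= \frac{1}{r} \frac{\partial}{\partial r}\left( k_w\, r\, \frac{\partial T_w}{\partial r} \right)
+ \frac{1}{r^2} \frac{\partial}{\partial \theta}\left( k_w \frac{\partial T_w}{\partial \theta} \right)
+ \frac{\partial}{\partial z}\left( k_w \frac{\partial T_w}{\partial z} \right)
\tag{6.1}
\]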


where $\rho_w$ is the mass density of the wafer, $C_{p_w}$ is the heat capacity of the wafer
(the product $M_w = \rho_w C_{p_w}$ is often referred to as the wafer thermal mass), and $k_w$
is the thermal conductivity of the wafer. The associated BCs are given by

\begin{align}
\frac{\partial T_w}{\partial r} &= 0, & r &= 0 \tag{6.2} \\
k_w \frac{\partial T_w}{\partial r} &= q_{\mathrm{edge}}(\theta, z), & r &= R_w \tag{6.3} \\
-k_w \frac{\partial T_w}{\partial z} &= q_{\mathrm{bottom}}(r, \theta), & z &= 0 \tag{6.4} \\
k_w \frac{\partial T_w}{\partial z} &= q_{\mathrm{top}}(r, \theta), & z &= \Delta z \tag{6.5}
\end{align}

where the BC (6.2) results from symmetry about the wafer center, and $q_{\mathrm{edge}}$, $q_{\mathrm{bottom}}$,
and $q_{\mathrm{top}}$ represent the net heat flow per unit surface area to and from the wafer
edge, bottom, and top boundary surfaces, respectively, and will be described in
more detail later.

For purposes of modeling film growth, we focus our attention on the top surface
of the wafer where reactions take place. Invoking the assumption of azimuthal
symmetry, so that no temperature gradients exist in the azimuthal direction (i.e.,
$\partial T_w / \partial \theta = 0$), time evolution of $T_w$ at the wafer top surface is governed, for $t > 0$,
$0 < r < R_w$, and $z = \Delta z$, by

\[
\rho_w C_{p_w} \frac{\partial T_w}{\partial t}
= \frac{1}{r} \frac{\partial}{\partial r}\left( k_w\, r\, \frac{\partial T_w}{\partial r} \right)
+ \frac{\partial}{\partial z}\left( k_w \frac{\partial T_w}{\partial z} \right)
\tag{6.6}
\]

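with BCs remaining the same as before except $q_{\mathrm{edge}} = q_{\mathrm{edge}}(z)$, $q_{\mathrm{bottom}} = q_{\mathrm{bottom}}(r)$,
and $q_{\mathrm{top}} = q_{\mathrm{top}}(r)$. We also assume that the wafer thickness is sufficiently small
so that no thermal gradients exist in the axial direction within the wafer interior.
Therefore, we approximate the axial gradient term at the top surface by

\begin{align}
\frac{\partial}{\partial z}\left( k_w \frac{\partial T_w}{\partial z} \right)
&\approx \frac{1}{\Delta z}\left( k_w \frac{\partial T_w}{\partial z}\bigg|_{z=\Delta z} - k_w \frac{\partial T_w}{\partial z}\bigg|_{z=0} \right) \notag \\
&= \frac{1}{\Delta z}\left( q_{\mathrm{top}} + q_{\mathrm{bottom}} \right) \tag{6.7}
\end{align}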


where we have made substitutions using BCs (6.4) and (6.5). The resulting PDE
governs the evolution of the wafer top surface temperature field as a function of
time and radial position,

\[
\rho_w C_{p_w} \frac{\partial T_w}{\partial t}
= \frac{1}{r} \frac{\partial}{\partial r}\left( k_w\, r\, \frac{\partial T_w}{\partial r} \right)
+ \frac{1}{\Delta z}\left( q_{\mathrm{top}} + q_{\mathrm{bottom}} \right)
\tag{6.8}
\]

with BCs

\begin{align}
\frac{\partial T_w}{\partial r} &= 0, & r &= 0 \tag{6.9} \\
k_w \frac{\partial T_w}{\partial r} &= q_{\mathrm{edge}}(\Delta z), & r &= R_w \tag{6.10}
\end{align}


Now, we must find expressions for $q_{\mathrm{top}}$ and $q_{\mathrm{bottom}}$, the net heat flow into the top
and bottom surfaces of the wafer. For this study, we assume that the top and
bottom surfaces are subject to identical heat transfer mechanisms, and let

\[
q_{\mathrm{top}} + q_{\mathrm{bottom}} = q^{\mathrm{em}} + q^{\mathrm{ab}} + q^{\mathrm{conv}} + q^{\mathrm{chem}}
\tag{6.11}
\]


where the terms on the right hand side of (6.11) represent the flow of thermal
energy to and from the wafer and are dependent on time, position, and wafer
temperature. In particular, $q^{\mathrm{em}}$ is radiative energy emitted, $q^{\mathrm{ab}}$ is radiative energy
absorbed, $q^{\mathrm{conv}}$ denotes energy losses due to convective heat transfer, and $q^{\mathrm{chem}}$ is
energy transfer due to the heat generated by chemical reactions. For this study,
we do not include the heat of chemical reactions and consequently ignore $q^{\mathrm{chem}}$.

The term $q^{\mathrm{em}}$ represents radiative losses from the wafer. We assume $q^{\mathrm{ab}}$ depends
on radiant heat flux from a uniform ambient, in this case the chamber walls, and
radiant heat flux from the lamps, but without reflections or other effects. The
individual terms are given by

\begin{align}
q^{\mathrm{em}} &= -2\, \epsilon_w \sigma_b T_w^4 \tag{6.12} \\
q^{\mathrm{ab}} &= 2\, \alpha_w \sigma_b T_c^4 + \sum_{i=1}^{10} Q_i u_i \tag{6.13}
\end{align}

where $\sigma_b$ denotes the Stefan-Boltzmann constant, $\epsilon_w$ denotes the wafer emissivity, $\alpha_w$
denotes the wafer absorptivity, $T_c$ denotes the uniform ambient temperature of
the chamber walls, $Q_i = Q_i(r)$ is a function of position describing the heat flux
intensity incident on the wafer due to the $i$-th lamp group, and $u_i = u_i(t)$ is the
time-varying actuated power level of the $i$-th lamp group.

The convective term is given by

\[
q^{\mathrm{conv}} = -h_v \left( T_w - T_g \right)
\tag{6.14}
\]

where $h_v$ denotes the convective heat transfer coefficient and $T_g$ denotes the temperature
of the gas flowing past the wafer. Note that we have assumed a constant
uniform gas temperature. In order to estimate $h_v$, we assume that flow in the process
chamber is a laminar flow along a flat plate. The mean heat transfer coefficient
is given in [124] (pp. 233-235) as

\[
h_v = \frac{0.332\, k_g\, Pr^{1/3}\, Re^{1/2}}{L}
\tag{6.15}
\]

where $k_g$ denotes the gas thermal conductivity, $Pr$ denotes the gas Prandtl number,
$Re$ denotes the gas Reynolds number, and $L$ denotes the length of the chamber.
We have computed the Reynolds number $Re$ to be approximately 27 for the flow
in the ASM Epsilon-1 during a typical deposition run, thus confirming the laminar
assumption. The calculated value of $h_v$ was then validated using flow and
temperature data from a corresponding CFD simulation.

Remark 6.2.1 It is sometimes convenient to assume that the wafer is a graybody,
so that $\epsilon_w = \alpha_w$ for all relevant wavelengths of radiation and wafer temperatures.
However, we do not make this assumption here, and use different values for emissivity
and absorptivity. □

Remark 6.2.2 The parameters $\rho_w$, $C_{p_w}$, and $k_w$ in general have a nonlinear dependence
on temperature, and can be modeled as polynomial functions of $T_w$. Likewise,
the parameters $\epsilon_w$ and $\alpha_w$ in general have a nonlinear dependence on temperature
and deposition thickness. However, we invoke the assumption that mass
density, heat capacity, thermal conductivity, emissivity, and absorptivity are constant,
i.e., no variation with temperature, film thickness, position, or time. □


Given these simplifying assumptions, the PDE model specializes to

\[
\frac{\partial T_w}{\partial t}
= \frac{k_w}{\rho_w C_{p_w}} \frac{1}{r} \frac{\partial}{\partial r}\left( r \frac{\partial T_w}{\partial r} \right)
+ \frac{h_v}{\rho_w C_{p_w} \Delta z} \left( T_g - T_w \right)
+ \frac{2\, \sigma_b}{\rho_w C_{p_w} \Delta z} \left( \alpha_w T_c^4 - \epsilon_w T_w^4 \right)
+ \frac{1}{\rho_w C_{p_w} \Delta z} \sum_{i=1}^{10} Q_i u_i
\tag{6.16}
\]

where we recall that $T_w = T_w(t, r)$, $Q_i = Q_i(r)$, and $u_i = u_i(t)$.

Since the guard ring insulates the wafer from radiation directed at its edge
boundary surface, we assume zero heat transfer at the wafer edge so that

\[
q_{\mathrm{edge}} = 0
\tag{6.17}
\]

giving the BCs

\begin{align}
\frac{\partial T_w}{\partial r} &= 0, & r &= 0 \tag{6.18} \\
\frac{\partial T_w}{\partial r} &= 0, & r &= R_w \tag{6.19}
\end{align}

Discretized Model

The continuum model as given by the above PDE and BCs can be discretized
using a suitable scheme, e.g., finite differences or finite elements. However, for our
simplified model it is easier to formulate a discretization by applying the energy
balance principles directly to individual annular elements of the wafer. The general
idea, which divides the wafer into annular regions, is illustrated in Figure 6.2.
Annular regions are numbered from 1 to $n$ with element 1 being the innermost disk

Figure 6.2: Heat transfer mechanisms affecting annular region of wafer.

and element $n$ being the outermost annular region. The $i$-th element has mean
radius $r(i)$, and is bounded by an outer cylinder of radius $r_{\mathrm{out}}$, inner cylinder of
radius $r_{\mathrm{in}}$, top surface at $z = \Delta z$, and bottom surface at $z = 0$. The discretization
is uniform so that

\[
\Delta r = r(i+1) - r(i), \qquad i = 1, \dots, n-1
\tag{6.20}
\]

is  constant  for  all  regions. 

The  usual  symmetry  assumptions  are  invoked  so  that  temperature  depends  on 
radial  position  and  time  only.  The  discretized  wafer  temperature  held  is  given  by 
the  n- vector  Tw(t),  where  the  i- th  entry  of  Tw(t)  represents  the  temperature  at 
radial  position  r(i)  and  time  t. 

The wafer heat transfer model is then given by the ODE

    \dot{T}_w = A_c T_w + A_r T_w^4 + A_v T_w + \Gamma + B P        (6.21)

where A_c, A_r, and A_v are n x n matrices representing the effects of, respectively,
conductive, radiative, and convective heat transfer mechanisms, Γ is a constant
n-vector that accounts for the gas and chamber wall temperatures, B is an n x m
matrix derived from discretized lamp zone radiant intensity profiles, and P = P(t)
is an m-vector of control inputs corresponding to lamp zone power levels. The
fourth power T_w^4 is taken entrywise. We present the details of the ODE model
below.
The top surface area, volume, and mass of annular region i are given, respectively,
by

    S(i) = \pi \left( r_{out}(i)^2 - r_{in}(i)^2 \right)

    V(i) = S(i) \, \Delta z                                  (6.22)

    m(i) = \rho_w V(i)
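The quantities in (6.22) can be tabulated with a few lines of Python. This is a minimal sketch, assuming a uniform mesh in which r_in(i) = (i - 1) Δr and r_out(i) = i Δr with Δr = R_w / n; the wafer radius, thickness, and density below are hypothetical placeholders, not the values of Appendix G.

```python
import math

def annular_mesh(R_w, n):
    """Inner and outer radii of n annular regions of equal radial width.

    Assumption: region 1 is a disk (r_in = 0) and region i spans
    [(i - 1) * dr, i * dr] with dr = R_w / n.
    """
    dr = R_w / n
    r_in = [i * dr for i in range(n)]
    r_out = [(i + 1) * dr for i in range(n)]
    return r_in, r_out

def region_properties(r_in, r_out, dz, rho_w):
    """Top surface area S(i), volume V(i), and mass m(i) as in (6.22)."""
    S = [math.pi * (ro ** 2 - ri ** 2) for ri, ro in zip(r_in, r_out)]
    V = [s * dz for s in S]
    m = [rho_w * v for v in V]
    return S, V, m

# Hypothetical values for illustration only.
R_w, n = 0.075, 101          # wafer radius [m], number of regions
dz, rho_w = 5.0e-4, 2330.0   # wafer thickness [m], density [kg/m^3]

r_in, r_out = annular_mesh(R_w, n)
S, V, m = region_properties(r_in, r_out, dz, rho_w)
```

A quick sanity check on such a mesh is that the areas S(i) sum to the full wafer area π R_w^2.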

Conductive heat transfer is then represented by the tridiagonal matrix A_c
whose entries are given, for i = 2, ..., n - 1, by

2  kw  rout(i)  T  r;n(i) 
(6.23)


(6.24) 


pw  CPw  Ar  rlnt{n)  -  4(4 
where  we  note  that  zero  heat  flux  BCs  have  been  incorporated  into  the  model  via 
boundary  elements  of  matrix  Ac. 
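To illustrate how an energy balance over annular elements leads to a tridiagonal conduction matrix with zero-flux boundaries, the following sketch assembles such a matrix. It is an assumption-laden reconstruction, not the thesis formula: conductive flux through each cylindrical face is approximated by k_w times the face area times the temperature difference divided by Δr, and all parameter values are hypothetical.

```python
def conduction_matrix(n, R_w, k_w, rho_w, c_pw):
    """Tridiagonal conduction matrix from an annular energy balance (sketch).

    For region i with radii r_in, r_out (uniform width dr), the balance
        m c dT_i/dt = k (2 pi r_out dz)(T_{i+1} - T_i)/dr
                    - k (2 pi r_in  dz)(T_i - T_{i-1})/dr
    is divided by m c = rho c pi (r_out^2 - r_in^2) dz, so dz cancels.
    """
    dr = R_w / n
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        r_in, r_out = i * dr, (i + 1) * dr
        coef = 2.0 * k_w / (rho_w * c_pw * dr * (r_out ** 2 - r_in ** 2))
        if i > 0:
            # Exchange with the inner neighbour through the face at r_in.
            A[i][i - 1] = coef * r_in
            A[i][i] -= coef * r_in
        if i < n - 1:
            # Exchange with the outer neighbour; omitted for the last
            # region, which encodes zero heat flux at the wafer edge.
            A[i][i + 1] = coef * r_out
            A[i][i] -= coef * r_out
    return A

# Hypothetical material parameters for illustration only.
A_c = conduction_matrix(n=5, R_w=0.075, k_w=30.0, rho_w=2330.0, c_pw=900.0)
```

Each row of the resulting matrix sums to zero, so a spatially uniform temperature field produces no conductive heating or cooling, as expected with insulated boundaries.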

The matrices representing radiative transfer from wafer surface to chamber
walls and convective heat transfer from the process gases to wafer are given,
respectively, by

    A_r = \mathrm{diag}\left( \frac{-\sigma_b \epsilon_w}{\rho_w C_{pw} \Delta z}, \ldots, \frac{-\sigma_b \epsilon_w}{\rho_w C_{pw} \Delta z} \right)        (6.25)

and

    A_v = \mathrm{diag}\left( \frac{-h_v}{\rho_w C_{pw} \Delta z}, \ldots, \frac{-h_v}{\rho_w C_{pw} \Delta z} \right)        (6.26)

where we note that A_r and A_v take a diagonal form as a result of the simplifications
made in our model.

The effect of radiation from chamber wall to wafer and convective transfer from
gas to wafer are incorporated into the constant vectors Γ_r and Γ_v, whose entries
are given by

    \Gamma_r(i) = \frac{\epsilon_c \sigma_b \alpha_w}{\rho_w C_{pw} \Delta z} T_c^4,   i = 1, \ldots, n        (6.27)

    \Gamma_v(i) = \frac{h_v}{\rho_w C_{pw} \Delta z} T_g,   i = 1, \ldots, n        (6.28)

where ε_c is the emissivity of the quartz chamber walls. These effects are combined
by summing into one constant vector

    \Gamma = \Gamma_r + \Gamma_v        (6.29)


The lamp heating component is modeled by the term BP, where the matrix B is
referred to as the influence matrix and P is the control input vector corresponding
to lamp zone power levels. The influence matrix B is derived from the heat flux
intensity profiles Q_i(r), i = 1, ..., m, associated with each of the independently
actuated lamp zones. The details of these flux profiles, or influence functions, are
given in Section 6.3.

The Epsilon-1 is equipped with four independently actuated lamp zones. However,
the analysis in Section 6.3 (based on some simplifying assumptions) yields
only three independent controls, i.e., two of the lamp zones produce identical flux
profiles. Thus, we have m = 3 control inputs in our model. The flux profiles are
suitably discretized and arranged in a matrix Q. They are then incorporated into
the influence matrix B given by

    B = \frac{\alpha_w}{\rho_w C_{pw} \Delta z} Q        (6.30)

Computation


To avoid problems of scaling in computational work, we normalize variables and
parameters so that all units cancel, i.e., write the model in dimensionless form.
It is customary to adopt a distinct notation for the dimensionless variables, e.g.,
a decorated version of T_w. Instead, we denote the dimensionless variables by the
same symbol as their dimensional counterparts and caution the reader to keep this
in mind. The conversions are

    T_w \to \frac{T_w}{T_{ref}}, \quad Q_i \to \frac{Q_i}{Q_{ref}}, \quad t \to \frac{t}{\tau}, \quad r \to \frac{r}{R_w}        (6.31)


It has been observed in the ASM reactor that T_c, the chamber wall temperature,
is approximately 300 K less than wafer temperature [105] during a typical
processing run. As reference values we select a wafer temperature of 1000 K and
chamber wall temperature of 700 K. The reference thickness h_ref of 1.0 micron
was selected because it is on the order of the thickness of films we are interested
in growing. The reference heat flux Q_ref of 29.24 W/cm^2 was computed using the
lamp power specification of 6 kW radiating over one-half of a spherical surface area
of radius 57.15 mm (2.25 in).
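The reference heat flux quoted above can be checked directly: a hemisphere of radius r has area 2πr^2, and dividing the 6 kW lamp power by that area gives the flux. The variable names below are illustrative only.

```python
import math

lamp_power_w = 6000.0   # 6 kW lamp power specification (from the text)
radius_cm = 5.715       # 57.15 mm (2.25 in), expressed in cm

# One-half of a spherical surface: (1/2) * 4 * pi * r^2 = 2 * pi * r^2
hemisphere_area_cm2 = 2.0 * math.pi * radius_cm ** 2

q_ref = lamp_power_w / hemisphere_area_cm2   # W/cm^2
print(round(q_ref, 2))   # 29.24
```

The hemisphere area is about 205.2 cm^2, which reproduces the stated value of 29.24 W/cm^2.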

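Using the dimensionless variables as given above, the parameters (matrices and
vectors) in equation (6.21) become

    A_c \to \frac{\tau}{R_w^2} A_c, \quad A_r \to \tau T_{ref}^3 A_r, \quad A_v \to \tau A_v,

    \Gamma_r \to \frac{\tau}{T_{ref}} \Gamma_r, \quad \Gamma_v \to \frac{\tau}{T_{ref}} \Gamma_v, \quad B \to \frac{\tau Q_{ref}}{T_{ref}} B        (6.32)

to yield a dimensionless ODE model equivalent to (6.21).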
Given an initial temperature profile, T_w0 = T_w(0), the temperature distribution
on the wafer surface can be determined as a function of time and radial position by
numerically integrating the ODE (6.21). Typically, the initial condition is a uniform
temperature field set to ambient, i.e., T_w0 = [700 ... 700]^T. We used a fourth-order
Runge-Kutta integration scheme to perform the numerical integrations. The
discretization resolution was typically set at n = 101.
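As an illustration of the integration step, the sketch below implements the classical fourth-order Runge-Kutta scheme for a generic system y' = f(t, y) and applies it to a toy right-hand side that has the structure of (6.21) with conduction omitted. The coefficients a_r, a_v, gamma, b, and p are hypothetical dimensionless values chosen only so that the example runs; they are not the parameters used in the thesis simulations.

```python
def rk4_integrate(f, y0, t0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 using classical RK4 with fixed step."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def toy_rhs(t, T, a_r=-1.0, a_v=-0.5, gamma=0.8, b=1.0, p=0.5):
    """Toy version of (6.21): diagonal radiative and convective terms,
    constant forcing gamma, and a constant lamp input p (all hypothetical)."""
    return [a_r * Ti ** 4 + a_v * Ti + gamma + b * p for Ti in T]

# Uniform initial field in dimensionless units (700 K / 1000 K = 0.7).
T_final = rk4_integrate(toy_rhs, [0.7, 0.7, 0.7], 0.0, 1.0, 200)
```

Because the toy model has no coupling between entries, a uniform initial field stays uniform; the full model adds the tridiagonal conduction term to the right-hand side.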

See  Appendix  G  for  values  of  all  physical  constants  and  parameters  used  in  our 
simulations. 


6.3  Lamp  Heating  Model 

In  this  section  we  present  the  lamp  heating  component  of  the  wafer  heat  transfer 
model  presented  in  Section  6.2.  This  component  enters  as  the  control  input  term  in 
the  evolution  equation  (6.21)  for  wafer  temperature.  We  describe  additional  details 
of the physical set-up of the Epsilon-1 lamp apparatus, derive the relationship
between  lamp  power  settings  and  heat  flux  incident  on  the  wafer  surface,  and 
validate  the  analysis  using  experimental  data. 

Lamp  Equipment 

The  Epsilon-1  reactor  is  equipped  with  21  tungsten-halogen  lamps  for  heating  the 
wafer.  There  are  17  linear  lamps  (long,  thin  tubes)  with  a  maximum  power  output 
of  6.0  kW  and  four  spot  lamps  (spherical  bulbs)  with  a  maximum  power  output 
of  1.0  kW.  The  linear  lamps  are  organized  into  two  arrays,  referred  to  as  upper 
and  lower.  The  layout  is  illustrated  in  Figure  6.3.  The  upper  and  lower  lamp 
arrays  are  located  outside  the  process  chamber,  respectively,  above  and  below  the 
top and bottom quartz walls. They illuminate, respectively, the top surface of the
wafer and the bottom surface of the susceptor. The upper lamps are arranged
perpendicular to the lower lamps. The spot lamps are located directly below, and
illuminate, the bottom of the center of the susceptor.

Figure 6.3: Organization (top view) of upper and lower lamp arrays and spot
lamps: individual lamps are assigned to lamp groups and heat zones as shown.

The power to individual lamps cannot be controlled in the Epsilon-1. Rather,
the lamps are combined into groups, which are further combined into zones. The
organization into groups and zones is also illustrated in Figure 6.3. The power to
each of the four lamp heat zones is controlled independently via four PID feedback
loops. In addition, the power to each of the ten lamp groups can be independently
controlled but only via manual settings. However, it is the four lamp heat zones
that normally serve as the actuators in the Epsilon-1 heating system. The
time-varying percentages of full power P_i supplied to the respective zones are the four
control inputs. The zone name roughly corresponds to the area of the wafer that
receives the most intense illumination from the particular zone, e.g., center, front
(upstream), rear (downstream), and side.

Flux Profiles


Each  individual  lamp  creates  a  heat  flux  profile,  i.e.,  a  spatially  distributed  radiant 
intensity,  that  is  incident  across  the  wafer  surface.  We  refer  to  these  spatial  profiles 
as  influence  functions,  the  value  of  which  at  a  given  point  on  the  wafer  surface  is 
the  heat  flux  intensity,  measured  in  Watts  per  unit  area,  irradiated  onto  the  given 
point. 

For purposes of this study, we assume that the shape of the flux profile for a
given lamp is determined solely by the geometry of the lamp and wafer apparatus.
Other factors that play a role, but are unmodeled here, include the effect of
reflectors, chamber walls, and any apparatus within the chamber enclosure that
either reflects or absorbs heat radiation. The magnitude of a lamp's flux profile is
determined completely by the maximum power output of the lamp, i.e., 6.0 kW
for the linear lamps and 1.0 kW for the spot lamps.

We use view factor analysis to compute the flux profiles Q_i(r, θ), as a function of
radial position r and azimuthal position θ, for the individual lamps in the Epsilon-1.
For purposes of this analysis, we ignore the apparent symmetry breaking as
described in Chapter 5, and assume that wafer rotation causes azimuthal symmetry
of lamp radiation. Azimuthal averaging accounts for wafer rotation and results in
profiles Q_i(r) as functions of r only. The view factor calculations are lengthy, so
we present the details and results of this procedure in Appendix H.

The individual flux profiles are combined using superposition, in accordance
with the previously described organization of lamps into groups and zones, to
produce four influence functions, each corresponding to one of the independently
actuated lamp heat zones. They are shown in Figure 6.4. Each profile Q_i(r) is
modulated by a corresponding power setting P_i(t) (control input), i.e., the
proportion of full power applied. The flux profiles Q_i and power settings P_i are then
incorporated into the evolution equation (6.21) for wafer temperature as described
in Section 6.2.

Figure 6.4: Heat flux intensity profiles for ASM Epsilon-1 heat zones: flux intensity
(W/cm^2) versus radial position for the four heat zones.

Remark  6.3.1  As  shown  in  Figure  6.4,  the  view  factor  analysis  yields  identical 
flux  profiles  for  the  front  and  rear  zones.  This  is  due  to  the  geometry  of  the  front 
and  rear  zones  with  respect  to  the  wafer,  wafer  rotation,  and  the  other  simplifying 
assumptions invoked in the model development. Thus, we use only three independent
lamp zone control inputs in our control system model for the Epsilon-1 RTP
system. □

Remark 6.3.2 We note that, given accurate flux intensity or temperature
measurements, one might be able to experimentally measure the flux profile for a given
lamp zone. However, it has been the experience of both Northrop Grumman and
ASM that an instrumented wafer used to take such measurements is unreliable.
Moreover, the instrumented wafer that is available for these purposes is capable of
measuring temperature at only nine points on the wafer surface, which is insufficient
resolution for purposes of model development. Another alternative is to infer
wafer  temperature  from  growth  rate  of  poly-silicon.  However,  this  method  is  still 
impractical,  since  a  prohibitively  large  number  of  measurements  for  wafer  location 
and  film  thickness  are  required  to  achieve  sufficiently  high  resolution.  □ 

Experimental  Validation 

We  do,  however,  use  an  experimental  approach  for  purposes  of  a  comparative 
validation  of  the  flux  profiles  that  were  determined  using  view  factor  analysis.  The 
procedure  was  as  follows: 

(i) We deposited poly-silicon on a non-rotating wafer for a fixed period of time τ
with lamp group i manually set to power setting P. Thickness measurements
were taken to give a thickness profile h_(i,P,τ)(r, θ) and growth rate distribution
R_(i,P)(r, θ) = h_(i,P,τ)(r, θ)/τ. We then averaged azimuthally to give growth rate in
terms of radial position for a rotating wafer, i.e., R_(i,P)(r).

(ii) The Arrhenius law for growth rate (5.1) was inverted to determine temperature
as a function of radial position, i.e.,

    T_{(i,P)}(r) = \gamma \left[ \ln\left( k_0 W_{SiH_4} \right) - \ln\left( R_{(i,P)}(r) \right) \right]^{-1}        (6.33)

(iii) The temperature field T_(i,P)(r) was substituted into the evolution equation
for temperature (6.21) in the wafer heat transfer model. Applying the steady-state
condition \dot{T} = 0, we can solve

    0 = A_c T + A_r T^4 + A_v T + \Gamma + B_i P        (6.34)

for the discretized influence function B_i(r).
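Steps (ii) and (iii) can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the growth law is taken to have the Arrhenius form R = k0 * w * exp(-gamma / T), which is the form whose inverse matches step (ii); the names k0, w_sih4, and gamma, and all numerical values, are hypothetical placeholders rather than the constants of equation (5.1).

```python
import math

def growth_rate(T, k0, w_sih4, gamma):
    """Assumed Arrhenius growth law: R = k0 * w * exp(-gamma / T)."""
    return k0 * w_sih4 * math.exp(-gamma / T)

def temperature_from_growth(R, k0, w_sih4, gamma):
    """Invert the assumed growth law to recover temperature from growth rate."""
    return gamma / (math.log(k0 * w_sih4) - math.log(R))

def influence_from_steady_state(A_c, a_r, a_v, Gamma_vec, T, P):
    """Solve 0 = A_c T + A_r T^4 + A_v T + Gamma + B_i P for the vector B_i.

    A_c is a dense n x n list of lists; a_r and a_v hold the diagonals of
    A_r and A_v; P is the scalar power setting of the isolated lamp group.
    """
    n = len(T)
    B = []
    for i in range(n):
        conduction = sum(A_c[i][j] * T[j] for j in range(n))
        residual = conduction + a_r[i] * T[i] ** 4 + a_v[i] * T[i] + Gamma_vec[i]
        B.append(-residual / P)
    return B

# Round trip on hypothetical constants: T -> R -> T.
R = growth_rate(1000.0, k0=1.0e6, w_sih4=0.1, gamma=2.0e4)
T_back = temperature_from_growth(R, k0=1.0e6, w_sih4=0.1, gamma=2.0e4)

# Two-region toy steady state with no conduction coupling.
B_i = influence_from_steady_state(
    A_c=[[0.0, 0.0], [0.0, 0.0]], a_r=[-1.0, -1.0], a_v=[-0.5, -0.5],
    Gamma_vec=[0.2, 0.2], T=[1.0, 1.0], P=0.5)
```

In the toy case the residual is -1.3 in each region, so each entry of B_i equals 2.6; with measured data, T would instead come from `temperature_from_growth` applied to the azimuthally averaged growth rates.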


Isolation  of  the  individual  lamp  groups  was  achieved  by  operating  the  reactor 
in  manual  mode,  i.e.,  with  the  automatic  control  loops  for  temperature  regulation 
turned  off.  In  manual  mode,  the  lamp  groups  are  no  longer  organized  into  four 
zones  for  the  purpose  of  temperature  control.  Instead,  each  of  the  ten  groups  can 
be  toggled  on  and  off  individually,  and  the  power  setting  of  each  (between  0%  and 
100%)  can  be  set  manually.  To  isolate  a  particular  lamp  group,  all  others  were 
turned off, while the power setting for the lamp group being tested was set manually
to an appropriate level.

The  wafer  was  heated  with  an  individual  lamp  group,  whose  power  setting  was 
adjusted  manually  until  at  least  one  of  the  thermocouple  readings  reached  the 
range  where  deposition  would  occur,  which  was  approximately  700  C.  The  exact 
temperature  readings  were  not  important  because  in  the  next  step  temperature 
would  be  inferred  from  thickness  data.  Then,  flow  of  silane  in  hydrogen  carrier 
was  started.  Silicon  was  deposited  for  five  minutes.  Wafer  rotation  was  turned  off 
so  that  effects  of  asymmetry  would  appear  in  the  resulting  deposition. 

This  procedure  was  followed  to  test  four  of  the  lamp  groups:  1,  8,  9,  and  10. 
Lamp  group  1  is  in  the  upper  lamp  array  and  radiates  directly  toward  the  top 
center  of  the  wafer.  Using  lamp  group  1  alone,  we  were  able  to  heat  the  wafer  to  a 
temperature  sufficiently  high  for  deposition  to  occur  and  to  record  sufficient  data 
for  analysis.  Lamp  groups  8,  9,  and  10  are  in  the  lower  lamp  array  and  radiate 
toward  the  bottom  of  the  susceptor.  Due  to  conduction  and  losses  throughout 
the  susceptor,  it  was  more  difficult  to  heat  the  wafer  using  each  of  these  lamp 
groups  alone.  Of  the  lamp  groups  we  isolated  in  the  lower  array,  only  lamp  group 
8  provided  enough  radiant  energy  to  heat  the  wafer  to  a  temperature  sufficiently 
high for deposition to occur. However, wafer temperature oscillated and was highly
nonuniform  in  this  case,  causing  the  data  to  be  unreliable.  We  focus  now  on  the 
experiment  that  tested  lamp  group  1  from  which  reliable  data  was  obtained. 

Lamp group 1 was isolated and set to 45% of full power, which brought the
center thermocouple reading to 740 C, sufficiently high for deposition to occur.
Silane flow rate was set at 30 sccm. After a five-minute deposition period, the
wafer was removed and thickness measurements were taken at 100 points on the
wafer surface.

Figure 6.5 shows two views of the resulting polysilicon film thickness profile.
Thermally  activated  growth  using  the  isolated  lamp  group  1  produces  a  “hill”  of 
polysilicon.  The  deposition  pattern  reaches  a  maximum  in  a  line  across  the  wafer 
center  parallel  to  the  lamps  in  lamp  group  1,  and  decreases  toward  the  wafer  edges. 
Qualitatively,  this  result  is  what  we  would  expect  given  the  geometry  of  lamp  group 
1  with  respect  to  the  wafer. 

The  thickness  data  is  then  used  to  compute  an  empirically  determined  heat 
flux  intensity  profile  for  lamp  group  1  as  outlined  previously.  Figure  6.6  shows  the 
result  along  with  the  analytically  determined  profile  for  purposes  of  comparison. 
The  result  indicates  a  reasonable  agreement  between  the  analytical  model  and 
experimental  data. 

6.4  Model  Reduction:  A  Comparative  Study 

In  this  section,  we  apply  the  POD  and  balanced  truncation  approaches  to  derive 
low-order  approximations  from  the  RTP  heat  transfer  control  system  model.  We 
compare  the  effectiveness  of  the  two  approaches  via  numerical  simulations  using 
the  full  and  reduced  RTP  control  system  models. 
Figure 6.5: Two views of polysilicon film thickness profile resulting from 5 minute
deposition using lamp group 1 at 45% power and silane flow rate of 30 sccm.
Top figure shows contour map where colors/shades represent thicknesses. Bottom
figure shows 3-dimensional view ("hill").

Figure 6.6: Experimentally determined heat flux intensity profile for lamp group
1 along with analytically determined profile for purposes of comparison.

6.4.1  RTP  Control  System 

Recall that in Section 6.2 the evolution of the temperature field on the wafer surface
was given by the ODE

    \dot{T}_w = A_c T_w + A_r T_w^4 + A_v T_w + \Gamma + B P        (6.35)

To model the measurement of temperature at discrete points on the wafer surface
via thermocouples, we augment the nonlinear state equation (6.35) with the linear
output equation

    T_{tc} = C T_w        (6.36)

where T_tc is a p-vector of thermocouple measurements, and C is a p x n matrix
with entries corresponding to thermocouple locations.

Under  our  modeling  assumptions,  there  are  m  =  3  independent  lamp  zone 
control  inputs.  Although  contact  measurement  does  not  occur  during  an  actual 
deposition run, we incorporate into the model a set of thermocouple sensors in
contact  with  the  wafer,  by  assuming  that  there  are  thermocouples  in  contact  with 
p  =  3  locations  on  the  wafer  surface:  center,  edge,  and  midpoint  between  center 
and  edge.  We  ignore  the  actual  placement  of  thermocouple  sensors  in  the  Epsilon-1 
susceptor  ring. 

In Section 6.4.3, we will use a linearized version of (6.35). To linearize, first
observe that

    \dot{x} = A_c x + A_r (x + T)^4 - A_r T^4 + A_v x + B u        (6.37)

has an equilibrium point at x = 0 and is equivalent to (6.35) under the changes of
variable x = T_w - T and u = P - P_ss, where P_ss is the control input that results
in a steady state temperature field of T_w = T. Linearizing (6.37) about the origin
gives

    \dot{x} = A x + B u        (6.38)

with

    A = A_c + A_v + 4 F        (6.39)

where

    F = A_r \, \mathrm{diag}(T^3)        (6.40)

and x and u are translations of T_w and P, respectively. The output equation for
the linearized control system is given by

    y = C x        (6.41)
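The linearization can be checked numerically by comparing the analytic Jacobian of (6.37) at x = 0, namely A_c + A_v + 4 A_r diag(T^3) with entrywise powers, against a central finite-difference Jacobian. The sketch below assumes NumPy is available and uses small hypothetical matrices rather than the thesis model.

```python
import numpy as np

def linearized_A(A_c, A_r, A_v, T_ss):
    """Analytic Jacobian of (6.37) at x = 0: A_c + A_v + 4 A_r diag(T_ss^3)."""
    return A_c + A_v + 4.0 * A_r @ np.diag(T_ss ** 3)

def rhs(x, A_c, A_r, A_v, T_ss):
    """Right-hand side of (6.37) with u = 0; powers are taken entrywise."""
    return A_c @ x + A_r @ ((x + T_ss) ** 4 - T_ss ** 4) + A_v @ x

# Hypothetical 4-state example: tridiagonal conduction, diagonal A_r and A_v.
n = 4
A_c = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
A_r = -0.3 * np.eye(n)
A_v = -0.1 * np.eye(n)
T_ss = np.ones(n)   # uniform dimensionless steady state

A = linearized_A(A_c, A_r, A_v, T_ss)

# Central finite differences, one column of the Jacobian per unit vector.
eps = 1e-6
J = np.column_stack([
    (rhs(eps * e, A_c, A_r, A_v, T_ss) - rhs(-eps * e, A_c, A_r, A_v, T_ss)) / (2 * eps)
    for e in np.eye(n)
])
```

The two matrices agree to within finite-difference error, which confirms that the factor of 4 multiplies the cubed steady-state temperature.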


6.4.2  POD  Approach 

We follow the procedure for deriving reduced models via the POD method as
detailed in Section 3.2.3. To generate empirical time series data, i.e., snapshots of the
wafer temperature field, the nonlinear control system (6.35)-(6.36) was simulated
using two different types of control input recipes. (A recipe refers to a function
giving the lamp power setting for each of the three lamp zones at each instant
of time.) They are referred to as Ramp-Soak-Cool (RSC) and
Perturbation-Of-Constant (POC).

Control  Input  Recipe  -  Ramp-Soak-Cool 

The  RSC  recipe  mimics  a  typical  processing  recipe  in  which  a  lamp  zone  power 
setting  is  gradually  ramped  up  from  zero  to  full  power,  maintained  at  full  power 
for  a  specified  period  of  time,  and  then  gradually  ramped  down  from  full  to  zero 
power,  as  shown  in  Figure  6.7.  This  recipe  is  applied  to  one  of  the  lamp  zones 
individually,  while  the  other  two  zones  are  held  at  zero  power.  The  RSC  simulation 
is  then  repeated  for  each  of  the  other  two  lamp  zones.  In  this  manner,  the  system 
response  to  excitation  from  an  RSC  recipe  for  each  of  the  three  lamp  zones  will 
appear  in  the  time  series  data.  The  time  series  state-response  data  is  shown  in 
Figure  6.8. 

The  entire  ensemble  (three  sets)  of  time  series  data  is  combined  and  arranged 
into  a  data  matrix,  each  column  of  which  represents  one  “snapshot”  of  the  wafer 
temperature field. The POD basis elements and associated eigenvalues, or relative
energy  values,  are  then  computed  via  SVD  and  ranked  according  to  magnitude  of 
relative  energy.  The  basis  elements  with  the  four  largest  eigenvalues  are  shown  in 
Figure  6.9.  Corresponding  relative  energy  values  are  compared  with  those  from 
applying  the  balancing  method  in  Table  6.2. 
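The snapshot-to-basis computation described above can be sketched with NumPy as follows. This is an illustrative outline, not the thesis code: the snapshot matrix here is synthetic, the relative energies are taken as normalized squared singular values, and no mean subtraction is applied (whether the thesis subtracts a mean is not stated in this section).

```python
import numpy as np

def pod_basis(snapshots):
    """Compute POD basis elements and relative energies via SVD.

    snapshots: n x N array, one temperature-field snapshot per column.
    Returns (U, energy): columns of U are basis elements ordered by
    decreasing energy; energy[k] is the fraction captured by mode k.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = s ** 2 / np.sum(s ** 2)
    return U, energy

# Synthetic ensemble: 60 snapshots built from two smooth radial shapes.
rng = np.random.default_rng(0)
r = np.linspace(0.0, 1.0, 101)
shapes = np.stack([np.ones_like(r), np.cos(np.pi * r)], axis=1)   # 101 x 2
coeffs = rng.normal(size=(2, 60)) * np.array([[10.0], [1.0]])     # 2 x 60
X = shapes @ coeffs                                               # 101 x 60

U, energy = pod_basis(X)
Phi = U[:, :2]          # retain the two dominant basis elements
a = Phi.T @ X[:, 0]     # reduced coordinates of the first snapshot
```

Since the synthetic data has rank two, the first two modes capture essentially all of the energy, mirroring the rapid decay of the percent-energy values reported for the wafer model.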
Figure 6.7: Lamp power settings for RSC recipe.

Figure 6.8: Snapshots of wafer temperature field with RSC input and uniform
initial temperature.

Figure 6.9: Basis elements computed using POD from RSC empirical data.


Control  Input  Recipe  -  Perturbation  Of  Constant 

The POC recipe applies small perturbations of a pre-determined set of constant
power settings which, if left unperturbed, would result in a uniform steady state
temperature field of 1000 K. The perturbations are achieved by adjusting the power
setting of each lamp zone, one at a time, first to 110% and then to 90%, of the
original setting. This results in 6 different control recipes, as shown in Table 6.1.
Note that if the nominal constant power settings are used, then the wafer temperature
field will evolve as a uniform field for all time. Thus, the perturbations
are used to elicit a response that would be characteristic of the system behavior in
response to certain types of disturbances.
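The six POC recipes can be generated mechanically from the nominal settings. The sketch below uses the nominal values listed in Table 6.1 and lists the 90% recipe before the 110% recipe for each zone, which is the order in which the table presents them; the function and variable names are illustrative.

```python
P_UNIF = [0.0798, 0.4265, 0.1965]   # nominal zone settings from Table 6.1

def poc_recipes(p_nominal, factors=(0.9, 1.1)):
    """Perturb one zone at a time by each factor, leaving the others nominal."""
    recipes = []
    for zone in range(len(p_nominal)):
        for factor in factors:
            p = list(p_nominal)
            p[zone] = factor * p_nominal[zone]
            recipes.append(p)
    return recipes

recipes = poc_recipes(P_UNIF)
for k, p in enumerate(recipes, start=1):
    print(k, [round(v, 4) for v in p])
```

The computed settings agree with the tabulated ones to within the rounding of the four-digit table entries.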

The system response to excitation from each of the six POC recipes is sampled
and combined as the time series data for computing POD basis elements. Time
series snapshots are shown in Figure 6.10. Once again, POD basis elements are
computed and ranked by corresponding relative energy value. The basis elements
with the four largest eigenvalues are shown in Figure 6.11. Corresponding relative
energy values are given in Table 6.2.

Recipe     Zone 1    Zone 2    Zone 3
P_unif     0.0798    0.4265    0.1965
P(1)       0.0718    0.4265    0.1965
P(2)       0.0878    0.4265    0.1965
P(3)       0.0798    0.3838    0.1965
P(4)       0.0798    0.4691    0.1965
P(5)       0.0798    0.4265    0.1768
P(6)       0.0798    0.4265    0.2161

Table 6.1: Lamp power settings for POC recipe. Settings are constant for all time.


6.4.3  Balancing  Approach 

We  apply  the  balanced  truncation  procedure  as  detailed  in  Section  3.3.3  to  the 
linearized  control  system  model  (6.38)  and  (6.41).  We  note  that  the  realization 
(A,B,C)  is  nearly  non-minimal,  i.e.,  the  condition  numbers  of  the  Gramians  and 
their  product  are 


    \mathrm{cond}(W_c) = 3.2 \times 10^{18}        (6.42)

    \mathrm{cond}(W_o) = 3.8 \times 10^{18}        (6.43)

    \mathrm{cond}(W_c W_o) = 9.4 \times 10^{18}        (6.44)

Remark 6.4.1 The near non-minimality of the RTP control system is expected,
since lamp influence functions and initial wafer temperature profiles are always
smooth and relatively uniform. Hence, any non-smooth or spatially fluctuating
temperature profiles are almost impossible to generate using the available control
inputs, and likewise, to measure using three thermocouple sensors. □

Figure 6.10: Snapshots of wafer temperature field with POC input and uniform
initial temperature.

Figure 6.11: Basis elements computed using POD from POC empirical data.

Figure 6.12: Left basis elements for balancing transformation.

Thus,  it  is  necessary  to  use  the  Schur  method  of  Safonov  and  Chiang  (see 
Appendix  E)  in  order  to  alleviate  the  numerical  difficulties.  Using  this  method, 
we derive a k-th order reduced model that is not necessarily balanced, but has
transfer function G(s) which is exactly the same as that for any k-th order balanced
realization,  thus  enjoying  the  same  attractive  error  bound. 
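For orientation, the quantities that drive balanced truncation (the controllability and observability Gramians and the Hankel singular values) can be computed for a small stable system as sketched below. This is not the Schur method of Safonov and Chiang; it is a naive illustration, assuming NumPy, that solves the Lyapunov equations through a Kronecker-product linear system and is only practical for small, well-conditioned examples.

```python
import numpy as np

def gramian(A, M):
    """Solve the Lyapunov equation A W + W A^T + M = 0 for W.

    Uses the row-major identity vec(A W + W A^T) = (kron(A, I) + kron(I, A)) vec(W).
    """
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(A, I) + np.kron(I, A)
    w = np.linalg.solve(K, -M.reshape(-1))
    return w.reshape(n, n)

def hankel_singular_values(A, B, C):
    """Square roots of the eigenvalues of W_c W_o, sorted in decreasing order."""
    W_c = gramian(A, B @ B.T)        # A W_c + W_c A^T + B B^T = 0
    W_o = gramian(A.T, C.T @ C)      # A^T W_o + W_o A + C^T C = 0
    eigs = np.linalg.eigvals(W_c @ W_o)
    return np.sort(np.sqrt(np.abs(eigs.real)))[::-1]

# Hypothetical two-state stable system.
A2 = np.diag([-1.0, -2.0])
B2 = np.ones((2, 1))
C2 = np.ones((1, 2))
hsv = hankel_singular_values(A2, B2, C2)
percent_energy = 100.0 * hsv / hsv.sum()
```

When the Gramians are as ill-conditioned as reported in (6.42)-(6.44), forming and factoring them directly in this way is numerically unreliable, which is the motivation given in the text for the Schur-based procedure.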

Application of the Schur method to (A, B, C) yields left and right basis elements
for a coordinate transformation, shown in Figures 6.12 and 6.13. Corresponding
relative energy values are compared with those from the POD approach in
Table 6.2. We note that the condition numbers for all of the matrices used in the
Schur procedure are less than 1000.

Figure 6.13: Right basis elements for balancing transformation.

6.4.4  Reduced  Model  Simulations 

Validation of the predictive capability of the reduced models is accomplished by
comparing simulation results using the original nth-order model with simulation
results using reduced kth-order approximations for various values of the reduced
model order k. In particular, the maximum deviation of the output signals, i.e.,
thermocouple readings, between the original and reduced models is computed
for each of the model reduction approaches we have previously described.

The test simulations use a uniform 700 C initial temperature field, and two
different test recipes as the lamp control inputs. The test recipes are different
from those used to generate the RSC and POC data ensembles.
Test Recipe 1    P = [0.5 0.5 0.5],    t ∈ [0, 1]

Test Recipe 2    P = [1.0 0.0 0.0],    t ∈ [0, 0.4)
                 P = [0.0 1.0 0.0],    t ∈ [0.4, 0.7)
                 P = [0.0 0.0 1.0],    t ∈ [0.7, 1.0]

The k-th order reduced models for k = 1, 2, 3, 4, and 5 and the full n = 101 order
model are numerically integrated using identical control recipes and initial states.
Simulated thermocouple readings are recorded for each simulation. The reduced
model fidelity, i.e., the error between the original and reduced models, is computed
as in (1.1), via

    e(k) = \| T_{tc} - \hat{T}_{tc} \|_{max},   k = 1, 2, 3, 4, 5        (6.45)

where we define the norm \|y\|_{max} for a time-dependent p-vector y(t) as

    \|y\|_{max} = \max \{ |y_i(t)| : 0 \le t < \infty, \; 1 \le i \le p \}        (6.46)

where y_i(t) corresponds to the temperature reading of thermocouple i at time t.
Thus, (6.45) gives the maximum deviation between actual and estimated thermocouple
readings over the entire simulated time sequence and over all three thermocouples,
i.e., a "worst case" error.
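On sampled simulation output, the worst-case error of (6.45) reduces to a maximum of absolute differences over all time samples and all thermocouples. A minimal sketch follows; the sample readings are made up for illustration and the data layout (one list of p readings per time sample) is an assumption.

```python
def max_deviation(y_full, y_reduced):
    """Worst-case output error between two sampled output histories.

    y_full, y_reduced: lists of time samples, each a list of p thermocouple
    readings taken at the same time instants.
    """
    return max(abs(a - b)
               for row_full, row_red in zip(y_full, y_reduced)
               for a, b in zip(row_full, row_red))

# Hypothetical readings at two time samples for p = 3 thermocouples.
y_full = [[700.0, 700.0, 700.0], [710.0, 705.0, 702.0]]
y_reduced = [[700.0, 700.0, 700.0], [709.5, 706.2, 702.0]]

e = max_deviation(y_full, y_reduced)   # largest gap is |705.0 - 706.2|
```

Because the simulations run over a finite horizon, the supremum over time in (6.46) is in practice taken over the stored samples.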

Remark  6.4.2  Due  to  the  shape  of  the  lamp  heat  flux  intensity  profiles  and  the 
smoothing  effect  of  the  diffusion  operator,  the  evolution  of  the  wafer  temperature 
field  does  not  produce  especially  interesting  behavior,  e.g.,  spatial  profiles  whose 
fluctuations  from  the  mean  vary  substantially  in  the  mean  square  sense  from  the 
initial  profile,  assuming  the  initial  profile  is  relatively  smooth.  Thus,  we  expect 
little  difficulty  in  capturing  the  essence  of  the  input-output  behavior  of  the  system 
in  a  low  dimensional  model.  Our  results  show  that  this  is  indeed  the  case.  □ 


Percent Energy Associated With Transformation Basis Elements

Method       Mode 1    Mode 2    Mode 3    Mode 4    Mode 5
POD RSC       95.06      4.77      0.14      0.03      0.00
POD POC       93.43      6.25      0.23      0.09      0.01
Balancing     98.02      1.83      0.13      0.02      0.00

Table 6.2: Normalized eigenvalues, i.e., percent energy, corresponding to basis
elements used in model reduction for POD method with RSC data, POD method
with POC data, and balancing approach.

Tables 6.2 and 6.3 give the relative energy values for basis elements, and the
maximum thermocouple temperature deviations for the original and reduced order
models. Figures 6.14, 6.15, and 6.16 show simulated thermocouple readings
resulting from simulations with test recipe 1, for the original n = 101 order model,
and reduced models of order k = 1, 2, and 3. Figures 6.17, 6.18, and 6.19 show
simulated thermocouple readings resulting from simulations with test recipe 2.

We  observe  that  the  output  responses  of  the  full  and  reduced  systems  are 
similar.  In  particular,  using  the  test  recipes  as  control  inputs,  the  input-output 
behavior  of  the  wafer  heat  transfer  system  can  be  reconstructed  using  reduced 
models  of  order  4  so  that  thermocouple  readings  are  within  1  degree  C  of  the 
readings  using  the  original  model.  This  holds  whether  the  POD  or  balancing 
method  is  used,  and  for  whichever  set  of  empirical  data  was  used  for  computing 
the  POD  transformation.  Even  reduced  models  of  order  2  produce  a  reasonable 
approximation  (but  not  suitable  for  control  applications)  with  “worst  case”  errors 
less than 15 degrees C.

Maximum deviation (degrees C) between outputs of original and reduced models

                                      Reduced Model Order
Simulation   Reduction Method       1        2       3       4       5
Test 1       POD RSC            27.23     2.68    0.58    0.11    0.01
Test 1       POD POC            26.85     1.26    1.13    0.10    0.05
Test 1       Balancing          50.68     7.03    0.44    0.08    0.02
Test 2       POD RSC            72.33     5.22    1.48    0.18    0.05
Test 2       POD POC            72.60     4.79    4.35    0.43    0.10
Test 2       Balancing          80.81    14.28    1.70    0.12    0.04

Table  6.3:  Maximum  deviation  (degrees  C)  between  outputs  of  original  and  re¬ 
duced  models  for  POD  method  with  RSC  data,  POD  method  with  POC  data,  and 
balancing  approach. 
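
The order-selection statements made in the text can be checked directly against Table 6.3. The following sketch, with hypothetical helper names, stores the tabulated maximum deviations and finds the smallest reduced order for which every method and test recipe stays below a given tolerance.

```python
# Maximum deviations (degrees C) copied from Table 6.3, indexed by order 1..5.
MAX_DEVIATION = {
    ("Test 1", "POD RSC"): [27.23, 2.68, 0.58, 0.11, 0.01],
    ("Test 1", "POD POC"): [26.85, 1.26, 1.13, 0.10, 0.05],
    ("Test 1", "Balancing"): [50.68, 7.03, 0.44, 0.08, 0.02],
    ("Test 2", "POD RSC"): [72.33, 5.22, 1.48, 0.18, 0.05],
    ("Test 2", "POD POC"): [72.60, 4.79, 4.35, 0.43, 0.10],
    ("Test 2", "Balancing"): [80.81, 14.28, 1.70, 0.12, 0.04],
}


def smallest_order_within(deviations, tolerance):
    """Smallest order whose tabulated deviation is below tolerance, or None."""
    for order, deviation in enumerate(deviations, start=1):
        if deviation < tolerance:
            return order
    return None


def smallest_common_order(table, tolerance):
    """Smallest order that meets the tolerance for every row of the table."""
    orders = [smallest_order_within(row, tolerance) for row in table.values()]
    if any(order is None for order in orders):
        return None
    return max(orders)


def worst_case(table, order):
    """Largest tabulated deviation over all rows at the given order."""
    return max(row[order - 1] for row in table.values())


print(smallest_common_order(MAX_DEVIATION, 1.0))
print(worst_case(MAX_DEVIATION, 2))
```

With the tabulated values, the 1 degree C tolerance is first met by all rows at order 4, and the worst case at order 2 is the Test 2 balancing entry.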


[Figure 6.14 panels: Original (n = 101); Reduced (k = 1); Reduced (k = 2); Reduced (k = 3)]

Figure 6.14: Thermocouple readings for original and reduced models with Test
Recipe 1 using transformation from POD RSC.

[Figure 6.15 panels: Original (n = 101); Reduced (k = 1)]

Figure 6.15: Thermocouple readings for original and reduced models with Test
Recipe 1 using transformation from POD POC.


[Figure 6.16 panels: Original (n = 101); Reduced (k = 1)]

Figure 6.16: Thermocouple readings for original and reduced models with Test
Recipe 1 using balancing transformation.

[Figure 6.17 panels: Original (n = 101); Reduced (k = 1)]

Figure 6.17: Thermocouple readings for original and reduced models with Test
Recipe 2 using transformation from POD RSC.


[Figure 6.18 panels: Original (n = 101); Reduced (k = 1)]

Figure 6.18: Thermocouple readings for original and reduced models with Test
Recipe 2 using transformation from POD POC.

[Figure 6.19 panels: Original (n = 101); Reduced (k = 1); Reduced (k = 2); Reduced (k = 3)]

Figure 6.19: Thermocouple readings for original and reduced models with Test
Recipe 2 using balancing transformation.

6.5  Remarks 


We  have  presented  physics-based  and  computational  models  for  heat  transfer  in  a 
silicon wafer, specifically pertaining to RTP in the Epsilon-1 reactor. The models
account  for  the  effects  of  conduction  within  the  solid  wafer,  convective  losses  to  the 
gas  phase,  and  radiative  losses  to  the  ambient.  We  model  radiative  heat  transfer 
from  lamps  to  wafer  by  determining  spatial  profiles  of  radiant  heat  flux  intensity 
for  individual  lamp  groups  and  lamp  zones.  These  radiant  intensity  profiles  were 
computed  analytically  using  view  factor  methods  and  then  validated  using  data 
from  poly-Si  growth  experiments.  The  model  does  not  account  for  the  effects  of 
gas  flow  patterns,  gas  phase  heat  transfer,  gas  phase  chemical  reactions,  and  other 
phenomena  affecting  the  rate  and  spatial  distribution  of  transport  of  chemical 
species  to  the  wafer  surface,  as  detailed  in  Chapter  5. 

The complexity and high computational demands of RTP models motivated
a study of model reduction techniques to derive low-order approximations. A
comparative study of model reduction approaches examined the POD and balanced
truncation methods, which were applied to the RTP control system model.
Numerical simulations demonstrated that both the POD and balancing methods
produced a change of coordinates that allows for a significant reduction in model
dimensionality.

The POD method appears to have performed slightly better than the balancing
method in this study, although both performed well. One reason for this
result is that the balancing transformation was computed for the linearized system,
while the validation tests were performed for the reduced order nonlinear system.
Another reason is the relatively simple input-output behavior of this particular
system, i.e., there is little difficulty in capturing the essential system behavior with
time series state-response data. The empirical eigenfunctions of the flow, and their
efficiency for purposes of representation, are relatively insensitive to the choice of
inputs. However, the results are not decisive in terms of determining which of the
two methods studied was more effective in reducing the order of the RTP control
system.

Chapter 7


Conclusions  and  Future  Research 


We  have  motivated  problems  in  state-space  model  reduction  with  a  discussion 
and  analysis  of  prominent  state-of-the-art  approaches,  demonstrating  the  ad-hoc 
nature  of  their  application  to  deriving  low-order  models  for  nonlinear  systems.  In 
the  process  we  have  emphasized  computational  issues  and  potential  hazards  in  the 
hope  that  the  exposition  may  serve  as  a  useful  guide. 

In  light  of  shortcomings  associated  with  the  aforementioned  methodologies, 
we  addressed  the  problem  of  computability  pertaining  to  the  Scherpen  theory 
and  procedure  for  balancing  of  nonlinear  systems.  We  developed  useful  methods, 
tools,  and  algorithms  to  compute  the  associated  energy  functions  and  coordinate 
transformations.  We  applied  our  approach  to  derive,  for  the  first  time,  balanced 
representations  of  nonlinear  state-space  models. 

Because of the computational complexity of our algorithms, this research merely
represents a first step toward making the balancing procedure a practical reduction
tool. There is little use for order reduction of state-space models with four or
fewer states. Faster algorithms will be required to balance and truncate high-order
systems of interest to engineers and scientists. Moreover, we have explored
only a limited class of systems when seeking exact formulas for the controllability
function. Clearly it would be beneficial to find results with broader applicability.

Our  research  in  modeling  of  RTCVD  for  silicon  growth  reflects  our  focus  on 
problems  of  practical  interest  to  our  industrial  partner.  Simulations  using  the 
process-equipment  model  provide  for  a  certain  degree  of  convergence  on  a  suitable 
set  of  operating  conditions  for  a  particular  process,  thus  avoiding  some  of  the  costly 
experimental  trials.  Furthermore,  there  is  value  in  an  enhanced  understanding  of 
the  factors  that  influence  deposition  rate  and  uniformity.  The  economic  advantages 
associated with accurate prediction of processing results and successful implementation
of model-based control in the semiconductor industry will continue to drive
the  torrent  of  research  in  this  area.  Of  particular  interest  for  Si-Ge  epitaxy  on 
wafers  with  a  pre-deposited  oxide  pattern,  such  as  that  performed  by  NG-ESSS,  is 
recent  work  toward  the  integration  of  atomic  level  models  for  crystal  growth  with 
macroscale  models  for  gas  phase  transport  phenomena  [131]. 

We have derived low-order models for an RTP heat transfer control system
using ad-hoc versions of the POD and balanced truncation approaches. Although
effective in this case, we believe that ultimately there is much to be gained from
development of a more systematic methodology. Further research toward the practical
application of balanced truncation for nonlinear systems appears to be a worthy
goal.

Appendix A


Notation 


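Definition                                                                 Remarks

$\underline{n} = \{1, \ldots, n\}$                                         set of natural numbers between 1 and positive integer $n$
$\sum_{i=1}^{\infty} a_i = \lim_{N \to \infty} \sum_{i=1}^{N} a_i$         infinite series
$A = [A]_{ij} = [a_{ij}]$                                                  matrix with numbers or functions $a_{ij}$ in the $i$-th row and $j$-th column
□                                                                          end of definitions, theorems, remarks, etc.
■                                                                          end of proofs
$\dot{x} \triangleq \frac{d}{dt} x$                                        time derivative
$\frac{\partial f}{\partial x} = \left[ \frac{\partial f}{\partial x_1} \cdots \frac{\partial f}{\partial x_n} \right]$      derivative vector
$\nabla f = \left( \frac{\partial f}{\partial x} \right)^T$                gradient vector
$\nabla \cdot f \triangleq \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i}$     divergence
$\Delta f = \sum_{i=1}^{n} \frac{\partial^2 f}{\partial x_i^2} = \nabla \cdot \nabla f$     Laplacian
$Df = [Df]_{ij} = \left[ \frac{\partial f_i}{\partial x_j} \right]$        derivative matrix (vector)
$D^2 f = [D^2 f]_{ij} = \left[ \frac{\partial^2 f}{\partial x_i \partial x_j} \right]$      second derivative matrix (Hessian)
$x(t, t_0, x_0, u)$                                                        solution of $\dot{x} = f(t, x, u)$ with $x(t_0) = x_0$ and input $u$
$x(\infty) = \lim_{t \to \infty} x(t)$                                     steady-state
$x(-\infty) = \lim_{t \to -\infty} x(t)$                                   steady-state (reverse-time system)

Hilbert Spaces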


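Notation                  Remarks

$\mathcal{L}_2(a,b)$      square-integrable functions
$\ell_2$                  square-summable sequences
$\mathcal{H}_X$           linear operations on second-order random variable $X$

Norms

Norm                    Argument                            Definition

$\mathcal{L}_2(a,b)$    $f \in \mathcal{L}_2(a,b)$          $\| f \|_{\mathcal{L}_2(a,b)} = \left( \int_a^b \| f(t) \|^2 \, dt \right)^{1/2}$
Hankel                  $G(s)$ stable                       $\| G \|_H = \sup_{u \in \mathcal{L}_2(-\infty,0)} \frac{\| y \|_{\mathcal{L}_2(0,\infty)}}{\| u \|_{\mathcal{L}_2(-\infty,0)}}$
$H_\infty$              $G(s)$ stable transfer function     $\| G \|_\infty = \sup_{\omega \in \mathbb{R}} \lambda_{\max}^{1/2} \left( G(-j\omega)^T G(j\omega) \right)$
Frobenius               $A \in \mathbb{R}^{n \times m}$     $\| A \|_F = \left( \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}^2 \right)^{1/2} = \left( \mathrm{tr}(A A^T) \right)^{1/2}$

Appendix B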


Manifolds  and  Coordinates 


We assume throughout this thesis that the state-space takes the form of a smooth
manifold. A smooth manifold is a set which locally can be identified with $\mathbb{R}^n$
together with the intrinsic notion of differentiability defined on $\mathbb{R}^n$. In order to
work with systems that evolve on a smooth manifold, we need the concepts of local
coordinates, local representative of a map, and coordinate transformation. These
and other related terminology are defined below.

The  material  contained  in  this  section  is  standard.  The  following  exposition 
is  drawn  from  texts  by  Nijmeijer  and  van  der  Schaft  [121],  Isidori  [69],  and  class 
notes  in  Geometric  Control  presented  by  Dayawansa  [37]  and  Krishnaprasad  [87] 
at  the  University  of  Maryland. 

First we define terminology needed for characterizing functions of several real
variables, i.e., functions defined on $\mathbb{R}^n$.

Definition B.0.1 (Homeomorphism) A function $f : A \subset \mathbb{R}^n \to B \subset \mathbb{R}^n$ is
said to be a homeomorphism if it is bijective (one-to-one and onto), and both $f$
and $f^{-1}$ are continuous. □

Definition B.0.2 (Smooth Function) Let $A$ be an open subset of $\mathbb{R}^n$. A function
$f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is said to be $C^k$ or $k$ times continuously differentiable if
all mixed partial derivatives for $j \le k$

    $\frac{\partial^j f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \quad \alpha_i \ge 0, \; i \in \underline{n}, \quad \sum_{i=1}^{n} \alpha_i = j$

exist and are continuous. The function $f$ is said to be $C^\infty$ or smooth if $f$ is $C^k$
for all $k$. □

Definition B.0.3 (Diffeomorphism) A function $f : A \subset \mathbb{R}^n \to B \subset \mathbb{R}^n$ is
said to be a diffeomorphism if $f$ is a homeomorphism of $A$ onto $B$, and both $f$ and
$f^{-1}$ are smooth. □

Definition B.0.4 (Coordinate Function) The function $r_i : \mathbb{R}^n \to \mathbb{R}$ for $i \in \underline{n}$
defined by

    $r_i(a_1, \ldots, a_n) = a_i$    (B.1)

is called the $i$-th coordinate function or slot function on $\mathbb{R}^n$. □


The  following  terminology  is  related  to  functions  defined  on,  and  systems  that 
evolve  on,  a  smooth  manifold.  Some  basic  elements  of  point-set  topology  are 
required. 

Definition B.0.5 (Topology) Let $M$ be a non-empty set. A collection $\mathcal{T}$ of subsets
of $M$ is said to be a topology on $M$ if

(i) $M$ and the empty set belong to $\mathcal{T}$;

(ii) The union of any number of subsets in $\mathcal{T}$ belongs to $\mathcal{T}$;

(iii) The intersection of any two (and hence any finite number) subsets in $\mathcal{T}$ belongs
to $\mathcal{T}$.

□

The members of $\mathcal{T}$ are called $\mathcal{T}$-open sets, or simply open sets, and the pair
$(M, \mathcal{T})$ is called a topological space. A basis for a topology $\mathcal{T}$ on $M$ is a collection
$\mathcal{S} \subset \mathcal{T}$ of open sets in $\mathcal{T}$ such that every open set can be written as a union of
members of $\mathcal{S}$. A neighborhood of a point $p$ in $M$ is any open set which contains $p$.

Definition B.0.6 (Hausdorff) A topological space $(M, \mathcal{T})$ is said to be Hausdorff
if any two different points $p_1$ and $p_2$ have disjoint neighborhoods, i.e., there exist
open sets $A_1, A_2 \in \mathcal{T}$ such that $p_1 \in A_1$, $p_2 \in A_2$, and $A_1 \cap A_2$ is empty. □

Definition B.0.7 (Continuous Mapping) A mapping $F : M_1 \to M_2$ between
topological spaces $(M_1, \mathcal{T}_1)$ and $(M_2, \mathcal{T}_2)$ is said to be continuous if $F^{-1}(A) \in \mathcal{T}_1$
for all $A \in \mathcal{T}_2$. □

We redefine the notion of homeomorphism in the context of mappings defined
on a topological space.

Definition B.0.8 (Homeomorphism) A mapping $F : M_1 \to M_2$ between topological
spaces $(M_1, \mathcal{T}_1)$ and $(M_2, \mathcal{T}_2)$ is said to be a homeomorphism if $F$ is bijective
and both $F$ and $F^{-1}$ are continuous. □

Definition B.0.9 (Topological Manifold) A Hausdorff topological space $(M, \mathcal{T})$
with a countable basis is said to be a topological manifold of dimension $n$ if for
any point $p$ in $M$ there exists a homeomorphism $\phi$ from some neighborhood $U$ of
$p$ onto an open subset of $\mathbb{R}^n$. □

A smooth manifold will be defined as a topological manifold with some additional
properties relating to differentiability. We use the following terminology.

Definition B.0.10 (Coordinate Chart) Let $(M, \mathcal{T})$ be a topological manifold
of dimension $n$. A pair $(U, \phi)$ with $U \in \mathcal{T}$ and $\phi$ a homeomorphism from $U$ onto
an open subset of $\mathbb{R}^n$ is called a coordinate chart or coordinate neighborhood for
$(M, \mathcal{T})$. □

Definition B.0.11 (Local Coordinates) Let $(U, \phi)$ be a coordinate chart for a
topological manifold $(M, \mathcal{T})$ of dimension $n$. The functions defined by

    $x_i = r_i \circ \phi, \quad i \in \underline{n}$    (B.2)

are called local coordinate functions for $(U, \phi)$. For a point $p \in U$, the values
$x_1(p), \ldots, x_n(p)$ are called the local coordinates of $p$. □

Definition B.0.12 (Local Representative) Let $(U, \phi)$ be a coordinate chart for
a topological manifold $(M, \mathcal{T})$ of dimension $n$. Let $f : M \to \mathbb{R}$ be a map. The
function $\hat{f} : \phi(U) \subset \mathbb{R}^n \to \mathbb{R}$ defined by

    $\hat{f} = f \circ \phi^{-1}$    (B.3)

is called the local representative of $f$. □

Remark B.0.13 By the definition of the local representative for $f$, we have

    $f(p) = \hat{f}(x_1(p), \ldots, x_n(p))$    (B.4)

for a point $p \in U$. We use the shorthand notation $f(x_1, \ldots, x_n)$ to denote both of
the above functions, omitting the caret and suppressing the point-dependence. □

Definition B.0.14 (Coordinate Transformation) Let $(U, \phi)$ and $(V, \psi)$ be coordinate
charts for a topological manifold $(M, \mathcal{T})$ of dimension $n$ such that $U \cap V$
is not empty, i.e., the coordinate charts overlap. Let $x_i = r_i \circ \phi$ and $z_i = r_i \circ \psi$ be
the corresponding coordinate functions. The map $S$ defined by

    $S = \psi \circ \phi^{-1} : \phi(U \cap V) \to \psi(U \cap V)$    (B.5)

is called the coordinate transformation from local coordinates $(x_1, \ldots, x_n)$ to local
coordinates $(z_1, \ldots, z_n)$ on $U \cap V$. □
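
To make the notion of a coordinate transformation concrete, here is a small sketch on the unit circle, a one-dimensional manifold that is not discussed in the thesis and is chosen purely for illustration. The two charts are stereographic projections from the north and south poles; all function names are hypothetical. On the overlap, the transformation $S = \psi \circ \phi^{-1}$ works out to $u \mapsto 1/u$, which is smooth away from $u = 0$.

```python
import math


def phi(point):
    """Chart on U = circle minus the north pole (0, 1): stereographic projection."""
    x, y = point
    return x / (1.0 - y)


def phi_inverse(u):
    """Inverse of phi: maps a real coordinate u back to a point on the unit circle."""
    d = u * u + 1.0
    return (2.0 * u / d, (u * u - 1.0) / d)


def psi(point):
    """Chart on V = circle minus the south pole (0, -1)."""
    x, y = point
    return x / (1.0 + y)


def transition(u):
    """Coordinate transformation S = psi o phi^{-1}, defined for u != 0."""
    return psi(phi_inverse(u))


# Pick a point on the circle that lies in both chart domains.
p = (math.cos(0.3), math.sin(0.3))
x_coord = phi(p)   # local coordinate of p in the chart (U, phi)
z_coord = psi(p)   # local coordinate of p in the chart (V, psi)
print(abs(transition(x_coord) - z_coord) < 1e-12)
```

The final comparison checks that applying $S$ to the coordinate of a point in the first chart reproduces its coordinate in the second chart.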

Definition B.0.15 ($C^\infty$-Compatible) Two coordinate charts $(U, \phi)$ and $(V, \psi)$
are said to be $C^\infty$-compatible if either $U \cap V$ is empty or the coordinate transformation
$S = \psi \circ \phi^{-1}$ and the inverse coordinate transformation $S^{-1} = \phi \circ \psi^{-1}$ are
both smooth. □

Remark B.0.16 The previous condition on $(U, \phi)$ and $(V, \psi)$ is referred to as the
transition property. □

Definition B.0.17 ($C^\infty$-Atlas) Let $(M, \mathcal{T})$ be a topological manifold of dimension
$n$. An indexed collection of pairwise $C^\infty$-compatible coordinate charts $\mathcal{V} =
\{(U_\alpha, \phi_\alpha)\}_{\alpha \in A}$ is said to be a $C^\infty$-atlas on $M$ if $\bigcup_{\alpha \in A} U_\alpha = M$. □

Remark B.0.18 The previous condition on $\mathcal{V}$ is referred to as the covering property. □

Definition B.0.19 (Maximal $C^\infty$-Atlas) Let $(M, \mathcal{T})$ be a topological manifold
of dimension $n$. A $C^\infty$-atlas $\mathcal{V}$ on $M$ is said to be maximal if any coordinate chart
$(V, \psi)$ which is $C^\infty$-compatible with every $(U_\alpha, \phi_\alpha) \in \mathcal{V}$ is also in $\mathcal{V}$. □

Remark B.0.20 The previous condition on $\mathcal{V}$ is referred to as the maximality
property. □

Definition B.0.21 (Smooth Manifold) A topological manifold $(M, \mathcal{T})$ is said
to be a smooth manifold if there exists a maximal $C^\infty$-atlas on $M$. □

Remark B.0.22 Thus, a smooth manifold possesses the properties of covering,
transition, and maximality. □

Finally, we redefine the notions of a smooth mapping and a diffeomorphism in
the context of mappings defined on a smooth manifold.

Definition B.0.23 (Smooth Mapping) Let $M_1$ and $M_2$ be smooth manifolds of
dimension $n_1$ and $n_2$, respectively. A map $F : M_1 \to M_2$ is said to be smooth if
for each $p \in M_1$ there exist coordinate charts $(U, \phi)$ of $M_1$ about $p$ and $(V, \psi)$ of
$M_2$ about $F(p)$, such that the local representative $\hat{F} = \psi \circ F \circ \phi^{-1}$ is smooth (as in
Definition B.0.2) from $\phi(U) \subset \mathbb{R}^{n_1}$ into $\psi(V) \subset \mathbb{R}^{n_2}$. □

Definition B.0.24 (Diffeomorphism) Let $M_1$ and $M_2$ be smooth manifolds,
both of dimension $n$. A map $F : M_1 \to M_2$ is said to be a diffeomorphism if
$F$ is a homeomorphism (as in Definition B.0.8) and both $F$ and $F^{-1}$ are smooth
(as in Definition B.0.23). □

Appendix C


Numerical  Simulation  of 
Stochastic  Differential  Equations 


In this appendix we present a numerical scheme used for simulation of the system
modeled by the white noise driven differential equation

    $\frac{d}{dt} X_t = f(t, X_t) + \sum_{i=1}^{m} g_i(t, X_t) \, (\zeta_t)_i$    (C.1)

and discrete time approximation of white noise signals. This method appears
in [94] and is based on results in [164, 165].

As stated earlier, in order to simulate (C.1) we numerically integrate the SDE

    $(dX_t)_i = \left[ f_i(t, X_t) + \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{m} \frac{\partial g_{ik}}{\partial x_j}(t, X_t) \, g_{jk}(t, X_t) \right] dt + \sum_{k=1}^{m} g_{ik}(t, X_t) \, (dW_t)_k, \quad i \in \underline{n}$    (C.2)

where $g_{ik}$ denotes the $i$-th component of $g_k$. The justification for (C.2) leads to a
computational method for numerical integration.

For simplicity, we consider the time-invariant, single-input ($m = 1$) case. The
results extend without difficulty to the general case. Consider a sequence of Gaussian
processes $\{\zeta_t^{(n)}, \; t \in \mathbb{R}_+\}$ which converges in some suitable sense to a white
noise, and such that, for each $n$, the process has a well behaved sample path. Then,
for each $n$, the initial value problem

    $\frac{d}{dt} X_t^{(n)} = f(X_t^{(n)}) + g(X_t^{(n)}) \, \zeta_t^{(n)}, \quad X_0^{(n)} = x_0$    (C.3)

can be solved, resulting in a sequence of processes $\{X_t^{(n)}, \; t \in \mathbb{R}_+\}$. If the sequence
$X_t^{(n)}$ converges to $X_t$ then it is natural to say that $X_t$ is the solution of (C.1).

The desired sequences are derived as follows. Consider a partition $0 = t_0 <
t_1 < \cdots < t_r$ of the interval of integration, with maximum step size

    $\Delta = \max_i \, (t_{i+1} - t_i)$    (C.4)

For each integration step, define a polygonal approximation of $W_t$

    $W_t^\Delta = W_{t_i} + \frac{W_{t_{i+1}} - W_{t_i}}{t_{i+1} - t_i} \, (t - t_i), \quad t_i \le t \le t_{i+1}$    (C.5)

and a corresponding approximation to $dW_t$

    $dW_t^\Delta = W^\Delta_{t_{i+1}} - W^\Delta_{t_i}, \quad t_i \le t \le t_{i+1}$    (C.6)

Then, since the polygonal approximation is piecewise differentiable with probability
1, the equation

    $dX_t^\Delta = f(X_t^\Delta) \, dt + g(X_t^\Delta) \, dW_t^\Delta$    (C.7)

is an ODE for the sample function $X_t^\Delta$. It is shown by Wong and Zakai [164, 165]
that, under certain conditions,

    $\lim_{\Delta \to 0} \text{ in q.m. } X_t^\Delta = X_t$    (C.8)

where $X_t$ is the unique solution of the SDE (C.2).

With the above justification in hand, we proceed to show how we can approximate
a white noise process by a discrete-time signal, i.e., a sequence of random
numbers, for purposes of numerical integration. The key detail is in choosing the
statistics of the random numbers in a manner consistent with the approximation
scheme.

Consider the discrete-time approximation to a continuous-time white noise process

    $\zeta_t^\Delta = \frac{d}{dt} W_t^\Delta = \lim_{h \to 0} \frac{W^\Delta_{t+h} - W^\Delta_t}{h} = \frac{W^\Delta_{t_{i+1}} - W^\Delta_{t_i}}{\Delta}, \quad t_i \le t \le t_{i+1}$    (C.9)

If $\Delta$ is sufficiently small then $\zeta^\Delta$ is Gaussian with zero mean and variance $1/\Delta$. To
see this, observe that, by definition of a Wiener process

    $E[W_{t_i}] = 0, \quad E[W_{t_i} W_{t_{i+1}}] = t_i, \quad E[W_{t_{i+1}} W_{t_{i+1}}] = t_{i+1}$    (C.10)

Thus $\mathrm{Var}(W_{t_{i+1}} - W_{t_i}) = \Delta$ and

    $\mathrm{Var}[\zeta_i^\Delta] = \frac{1}{\Delta}, \quad i = 1, 2, \ldots$    (C.11)

The desired discrete time signal is a sequence of Gaussian random variables with
zero mean and variance $1/\Delta$.

It is typical that one has access to a random number generator that can generate
a sequence of zero mean unit variance Gaussian random variables $\{Z_k, \; k = 1, 2, \ldots\}$.
In this case, we set

    $dW_{t_k} = \left( \frac{1}{\Delta} \right)^{1/2} Z_k, \quad k = 1, 2, \ldots$    (C.12)

as a discrete time approximation to $dW_t$ within a suitable scheme for numerical
integration of (C.2). A comparative study of numerical integration schemes for
SDEs appears in [167]. We used a 4th-order Runge-Kutta scheme to integrate (C.2)
with appropriately chosen time step $\Delta$. For random number generation, we used
the built-in linear congruential generator of MATLAB. According to the MATLAB
manual, it can generate all floating point numbers in the range $[2^{-53}, 1 - 2^{-53}]$ and
produce $2^{1492}$ values before repeating. This was easily sufficient for our purposes.
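
The following sketch illustrates the two ingredients described in this appendix on a scalar toy problem: generating a discrete white noise sequence with variance $1/\Delta$ from unit-variance Gaussian samples, and integrating the corrected SDE. It is only an illustration under stated assumptions. The example system $f(x) = a x$, $g(x) = \sigma x$ is hypothetical, and a plain Euler-Maruyama step is substituted for the 4th-order Runge-Kutta scheme used in the thesis. Python's standard `random` module stands in for the MATLAB generator.

```python
import random


def discrete_white_noise(num_steps, delta, rng):
    """Zero-mean Gaussian samples with variance 1/delta, built from unit-variance samples."""
    scale = (1.0 / delta) ** 0.5
    return [scale * rng.gauss(0.0, 1.0) for _ in range(num_steps)]


def corrected_drift(x, a, sigma):
    """Drift f(x) + 0.5 * g'(x) * g(x) for the assumed f(x) = a*x and g(x) = sigma*x."""
    return a * x + 0.5 * sigma * sigma * x


def simulate(x0, a, sigma, delta, num_steps, rng):
    """Euler-Maruyama integration of the corrected SDE (a stand-in for Runge-Kutta)."""
    noise = discrete_white_noise(num_steps, delta, rng)
    x = x0
    path = [x0]
    for zeta in noise:
        d_w = zeta * delta  # Wiener increment over one step; its variance is delta
        x = x + corrected_drift(x, a, sigma) * delta + sigma * x * d_w
        path.append(x)
    return path


rng = random.Random(0)
path = simulate(x0=1.0, a=-1.0, sigma=0.2, delta=0.01, num_steps=500, rng=rng)
print(len(path))
```

Multiplying each white noise sample by $\Delta$ recovers a Wiener increment with variance $\Delta$, so the same random sequence can drive either the ODE form or the increment form of the model.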
Appendix D


Proof  of  the  Proper  Orthogonal 
Decomposition  Theorem 


Continuous  Parameter  POD  (Theorem  3.2.5) 


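Proof For simplicity we assume without loss of generality that the process
$\{X_t, \; t \in [a,b]\}$ is scalar-valued. Suppose that the functions $\{\phi_1, \phi_2, \ldots\}$ satisfy
the integral equation (3.10) and that the random variables $\{a_1, a_2, \ldots\}$ are defined
by (3.11) so that the orthonormality condition (3.8) holds. Define for each $N =
1, 2, \ldots$

    $S_N(t) = \sum_{i=1}^{N} \sqrt{\lambda_i} \, a_i \, \phi_i(t)$    (D.1)

The POD (3.7) is equivalent to the statement

    $\lim_{N \to \infty} \| X_t - S_N(t) \|^2_{\mathcal{H}_X} = \lim_{N \to \infty} E\left[ |X_t - S_N(t)|^2 \right] = 0$    (D.2)

Observe that

    $E\left[ |X_t - S_N(t)|^2 \right] = E\left[ |X_t|^2 \right] + E\left[ |S_N(t)|^2 \right] - 2 E\left[ X_t \overline{S_N(t)} \right]$    (D.3)

Computing the terms in (D.3) we get

    $E\left[ |X_t|^2 \right] = R(t,t)$    (D.4)

    $E\left[ |S_N(t)|^2 \right] = \sum_{i=1}^{N} \sum_{j=1}^{N} \sqrt{\lambda_i} \sqrt{\lambda_j} \, E\left[ a_i \overline{a_j} \right] \phi_i(t) \overline{\phi_j(t)}$
    $\qquad = \sum_{i=1}^{N} \sum_{j=1}^{N} \sqrt{\lambda_i} \sqrt{\lambda_j} \, \phi_i(t) \overline{\phi_j(t)} \, \delta_{ij}$
    $\qquad = \sum_{i=1}^{N} \lambda_i |\phi_i(t)|^2$    (D.5)

    $E\left[ X_t \overline{S_N(t)} \right] = E\left[ X_t \sum_{i=1}^{N} \sqrt{\lambda_i} \, \overline{a_i} \, \overline{\phi_i(t)} \right]$
    $\qquad = \sum_{i=1}^{N} \overline{\phi_i(t)} \, E\left[ X_t \int_a^b \phi_i(s) \overline{X_s} \, ds \right]$
    $\qquad = \sum_{i=1}^{N} \overline{\phi_i(t)} \int_a^b \phi_i(s) \, E\left[ X_t \overline{X_s} \right] ds$
    $\qquad = \sum_{i=1}^{N} \overline{\phi_i(t)} \int_a^b R(t,s) \, \phi_i(s) \, ds$
    $\qquad = \sum_{i=1}^{N} \overline{\phi_i(t)} \left( \lambda_i \phi_i(t) \right)$
    $\qquad = \sum_{i=1}^{N} \lambda_i |\phi_i(t)|^2$    (D.6)

Substituting evaluated terms into equation (D.3) yields

    $E\left[ |X_t - S_N(t)|^2 \right] = R(t,t) + \sum_{i=1}^{N} \lambda_i |\phi_i(t)|^2 - 2 \sum_{i=1}^{N} \lambda_i |\phi_i(t)|^2 = R(t,t) - \sum_{i=1}^{N} \lambda_i |\phi_i(t)|^2$    (D.7)

The covariance function $R(\cdot, \cdot)$ is Hermitian symmetric and nonnegative definite
by Propositions (2.5.19) and (2.5.20). It is also continuous on $[a,b] \times [a,b]$ by
q.m. continuity of $\{X_t, \; t \in [a,b]\}$. Therefore, the hypotheses of Mercer's theorem
hold and the spectral decomposition exists, given by $R(t,s) = \sum_{i=1}^{\infty} \lambda_i \phi_i(t) \overline{\phi_i(s)}$.
Consequently

    $R(t,t) = \sum_{i=1}^{\infty} \lambda_i |\phi_i(t)|^2$    (D.8)

where convergence is uniform for $t \in [a,b]$. Equations (D.7) and (D.8) together
imply the desired result, i.e., $\lim_{N \to \infty} E\left[ |X_t - S_N(t)|^2 \right] = 0$.

Conversely, suppose $\{X_t, \; t \in [a,b]\}$ has the stated expansion. Then,

    $R(t,s) = E\left[ X_t \overline{X_s} \right]$
    $\qquad = E\left[ \sum_{i=1}^{\infty} \sqrt{\lambda_i} \, a_i \, \phi_i(t) \sum_{j=1}^{\infty} \sqrt{\lambda_j} \, \overline{a_j} \, \overline{\phi_j(s)} \right]$
    $\qquad = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \sqrt{\lambda_i} \sqrt{\lambda_j} \, E\left[ a_i \overline{a_j} \right] \phi_i(t) \overline{\phi_j(s)}$
    $\qquad = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \sqrt{\lambda_i} \sqrt{\lambda_j} \, \phi_i(t) \overline{\phi_j(s)} \, \delta_{ij}$
    $\qquad = \sum_{i=1}^{\infty} \lambda_i \phi_i(t) \overline{\phi_i(s)}$    (D.9)

Therefore,

    $\int_a^b R(t,s) \, \phi_i(s) \, ds = \int_a^b \left( \sum_{j=1}^{\infty} \lambda_j \phi_j(t) \overline{\phi_j(s)} \right) \phi_i(s) \, ds$
    $\qquad = \sum_{j=1}^{\infty} \lambda_j \phi_j(t) \int_a^b \overline{\phi_j(s)} \, \phi_i(s) \, ds$
    $\qquad = \sum_{j=1}^{\infty} \lambda_j \phi_j(t) \, \delta_{ji}$
    $\qquad = \lambda_i \phi_i(t)$    (D.10)

so the functions $\phi_i$ satisfy the integral equation (3.10). ■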
Discrete  Parameter  POD  (Theorem  3.2.16) 

The POD theorem can be proved in the discrete parameter case in a similar fashion
to the proof in the continuous parameter case, where the spectral theorem is used
instead of Mercer's theorem. However, we offer the following, simpler approach.

Proof Observe that the sampled data POD is written compactly in equation (3.29).
Substitution of (3.30) together with orthogonality of $\Phi$ yields the
desired identity. ■
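
A discrete analogue of the truncation-error identity (D.7) can be checked numerically. The sketch below is an illustration under assumptions, not the thesis's sampled-data formulation: equations (3.29) and (3.30) are not reproduced here, so the sample covariance is taken to be the averaged outer product of synthetic snapshot vectors, and NumPy is assumed to be available. It verifies that the mean-square error of a rank-$k$ reconstruction equals the sum of the discarded eigenvalues.

```python
import numpy as np

# Synthetic snapshot matrix (hypothetical data): n states, m snapshots,
# dominated by three directions plus a small perturbation.
rng = np.random.default_rng(0)
n, m = 6, 200
snapshots = rng.standard_normal((n, 3)) @ rng.standard_normal((3, m))
snapshots += 0.01 * rng.standard_normal((n, m))

# Assumed sample covariance: averaged outer product of the snapshots.
R = snapshots @ snapshots.T / m

# Symmetric eigendecomposition, reordered so eigenvalues are descending.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
Phi = eigvecs[:, order]


def truncation_error(k):
    """Mean-square error of projecting the snapshots onto the first k modes."""
    basis = Phi[:, :k]
    residual = snapshots - basis @ (basis.T @ snapshots)
    return np.sum(residual ** 2) / m


for k in range(1, 4):
    print(k, truncation_error(k), lam[k:].sum())
```

The two printed columns agree for each $k$, mirroring the way the error in (D.7) is the part of $R(t,t)$ not yet captured by the first $N$ terms.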
Appendix E


Algorithms  for  Linear  Balancing 


Given a minimal realization $(A, B, C)$ with $A$ stable, a balanced realization
$(S_{\mathrm{bal}}^{-1} A S_{\mathrm{bal}}, \; S_{\mathrm{bal}}^{-1} B, \; C S_{\mathrm{bal}})$ such that (3.89) holds can be obtained through the
following standard algorithm:

Algorithm E.0.25 (Laub [90])

(L1) Compute $W_c$ and $W_o$ (solve Lyapunov equations (3.83) and (3.84) via the
Bartels-Stewart algorithm [14]).

(L2) Compute a matrix $L_c$ such that $W_c = L_c L_c^T$ (Cholesky decomposition).

(L3) Form the matrix $L_c^T W_o L_c$ (matrix multiplications).

(L4) Compute an orthogonal matrix $U$ and a diagonal matrix $\Sigma$ such that
$L_c^T W_o L_c = U \Sigma^2 U^T$ (spectral decomposition).

(L5) Form the matrices $S_{\mathrm{bal}} = L_c U \Sigma^{-1/2}$ and $S_{\mathrm{bal}}^{-1} = \Sigma^{1/2} U^T L_c^{-1}$ (matrix
multiplications, matrix inversion).

(L6) Form the balanced state-space matrices $(S_{\mathrm{bal}}^{-1} A S_{\mathrm{bal}}, \; S_{\mathrm{bal}}^{-1} B, \; C S_{\mathrm{bal}})$ (matrix
multiplications).

□
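
The steps (L1) through (L6) can be sketched in a few lines of NumPy for a small example. Several things here are assumptions rather than part of the original algorithm statement: the Lyapunov equations are taken in the standard form $A W_c + W_c A^T + B B^T = 0$ and $A^T W_o + W_o A + C^T C = 0$ (equations (3.83) and (3.84) are not reproduced in this appendix), they are solved by brute-force vectorization instead of the Bartels-Stewart algorithm, and the two-state system is hypothetical.

```python
import numpy as np


def lyapunov_solve(A, Q):
    """Solve A W + W A^T + Q = 0 by vectorization (only sensible for small n)."""
    n = A.shape[0]
    eye = np.eye(n)
    # Row-major vec: vec(A W) = kron(A, I) vec(W), vec(W A^T) = kron(I, A) vec(W).
    K = np.kron(A, eye) + np.kron(eye, A)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)


def balance(A, B, C):
    """Follow steps (L1)-(L6); returns the balanced matrices and intermediates."""
    Wc = lyapunov_solve(A, B @ B.T)               # (L1) controllability Gramian
    Wo = lyapunov_solve(A.T, C.T @ C)             # (L1) observability Gramian
    Lc = np.linalg.cholesky(Wc)                   # (L2) Wc = Lc Lc^T, Lc lower triangular
    M = Lc.T @ Wo @ Lc                            # (L3)
    sigma_sq, U = np.linalg.eigh(M)               # (L4) M = U Sigma^2 U^T
    order = np.argsort(sigma_sq)[::-1]            # sort singular values descending
    sigma = np.sqrt(sigma_sq[order])
    U = U[:, order]
    S = Lc @ U @ np.diag(sigma ** -0.5)           # (L5)
    S_inv = np.diag(sigma ** 0.5) @ U.T @ np.linalg.inv(Lc)
    balanced = (S_inv @ A @ S, S_inv @ B, C @ S)  # (L6)
    return balanced, sigma, (Wc, Wo, S, S_inv)


# Hypothetical stable, minimal two-state system.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

balanced, sigma, (Wc, Wo, S, S_inv) = balance(A, B, C)
print(sigma)
```

In the new coordinates both Gramians, $S_{\mathrm{bal}}^{-1} W_c S_{\mathrm{bal}}^{-T}$ and $S_{\mathrm{bal}}^{T} W_o S_{\mathrm{bal}}$, equal $\Sigma$, which is the property a balancing transformation is meant to deliver.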

Remark E.0.26 An algorithm presented by Moore [109] is essentially the same
except that a spectral decomposition replaces the Cholesky decomposition in step
(L2). □

The following improvement to the standard algorithm is more elegant numerically
in that it computes the Cholesky factors without actually solving the Lyapunov
equations for the Gramians, and computes the SVD of a product of Cholesky
factors without explicitly forming their product. This is important in computing
small singular values accurately.

Algorithm E.0.27 (Laub, et al. [91])

(LH1) Compute matrices $L_c$ and $L_o$ such that $W_c = L_c L_c^T$ and $W_o = L_o L_o^T$. The
Cholesky decompositions are performed without ever forming the Gramians
via the algorithm of Hammarling [64].

(LH2) Compute orthogonal matrices $U$, $V$ and a diagonal matrix $\Sigma$ such that
$L_o^T L_c = U \Sigma V^T$. The SVD is performed without ever forming the product
$L_o^T L_c$ via the algorithm of Heath, et al. [65].

(LH3) Form the matrices $S_{\mathrm{bal}} = L_c V \Sigma^{-1/2}$ and $S_{\mathrm{bal}}^{-1} = \Sigma^{-1/2} U^T L_o^T$ (matrix
multiplications).

(LH4) Form the balanced state-space matrices (matrix multiplications).

□

Remark E.0.28 The computational complexity of these algorithms is roughly
the same, i.e., $O(n^3)$ with pre-multiplier in the 40-50 range. □

Nearly  Non-Minimal  Systems 

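We now present an algorithm for dealing with nearly non-minimal systems.

Algorithm E.0.29 (Safonov and Chiang [136])

(SC1) Compute $n \times k$ matrices $V_{r,\mathrm{big}}$ and $V_{l,\mathrm{big}}$ whose columns form bases for the
respective right and left eigenspaces of $W_c W_o$ associated with the “big” eigenvalues
$\sigma_1^2, \ldots, \sigma_k^2$. This is done via the ordered Schur decomposition of $W_c W_o$
as follows.

(a) Compute $W_c$ and $W_o$.

(b) Compute an orthogonal matrix $V$ such that $V W_c W_o V^T$ is upper triangular,
i.e., put $W_c W_o$ into Schur form. The fact that $W_c$ and $W_o$ are
real and symmetric ensures the existence of a real Schur transformation
matrix $V$.

(c) Compute orthogonal transformations $V_a$ and $V_d$ which order the Schur
forms in ascending and descending order, respectively,

    $V_a^T W_c W_o V_a = \mathrm{diag}(\lambda_{a,n}, \ldots, \lambda_{a,1}) + T_a$    (E.1)

    $V_d^T W_c W_o V_d = \mathrm{diag}(\lambda_{d,1}, \ldots, \lambda_{d,n}) + T_d$    (E.2)

where $T_a$ and $T_d$ are strictly upper triangular and such that

    $\{\lambda_{a,1}, \ldots, \lambda_{a,k}\} = \{\lambda_{d,1}, \ldots, \lambda_{d,k}\} = \{\sigma_1^2, \ldots, \sigma_k^2\}$    (E.3)

    $\{\lambda_{a,k+1}, \ldots, \lambda_{a,n}\} = \{\lambda_{d,k+1}, \ldots, \lambda_{d,n}\} = \{\sigma_{k+1}^2, \ldots, \sigma_n^2\}$    (E.4)

(d) Partition $V_a$ and $V_d$ as

    $V_a = \left[ \, \underbrace{V_{r,\mathrm{small}}}_{n-k} \;\middle|\; \underbrace{V_{l,\mathrm{big}}}_{k} \, \right]$    (E.5)

    $V_d = \left[ \, \underbrace{V_{r,\mathrm{big}}}_{k} \;\middle|\; \underbrace{V_{l,\mathrm{small}}}_{n-k} \, \right]$    (E.6)

(SC2) Form $E_{\mathrm{big}} = V_{l,\mathrm{big}}^T V_{r,\mathrm{big}}$ and compute the SVD

    $E_{\mathrm{big}} = U_{E,\mathrm{big}} \, \Sigma_{E,\mathrm{big}} \, V_{E,\mathrm{big}}^T$    (E.7)

(SC3) The not necessarily balancing transformations are

    $S_{l,\mathrm{big}} = V_{l,\mathrm{big}} \, U_{E,\mathrm{big}} \, \Sigma_{E,\mathrm{big}}^{-1/2}$    (E.8)

    $S_{r,\mathrm{big}} = V_{r,\mathrm{big}} \, V_{E,\mathrm{big}} \, \Sigma_{E,\mathrm{big}}^{-1/2}$    (E.9)

(SC4) The reduced model is given by

    $(\hat{A}, \hat{B}, \hat{C}) = (S_{l,\mathrm{big}}^T A \, S_{r,\mathrm{big}}, \; S_{l,\mathrm{big}}^T B, \; C \, S_{r,\mathrm{big}})$    (E.10)

□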


Gradient  Flow  Methods 

Finally,  we  briefly  describe  the  gradient  flow  method  of  Helmke  and  Moore  [66]. 
Consider the usual linear transformation $x = S z$ and let $T = S^{-1}$ so that $z = T x$.
To get a quantitative measure of how the Gramians change under the transformation,
the authors use the cost function

$$ \phi(T) = \mathrm{tr}\left( T W_c T^T + (T^T)^{-1} W_o T^{-1} \right) \tag{E.11} $$

which corresponds to the sum of the eigenvalues of the transformed Gramians.
Define the symmetric matrix $P = T^T T$ so that the cost function

$$ \psi(P) = \mathrm{tr}\left( W_c P + W_o P^{-1} \right) \tag{E.12} $$

is equivalent to the cost function $\phi$. The authors show that the cost function $\psi$
has compact sublevel sets, therefore ensuring the existence of a unique minimizing
positive definite symmetric matrix $P_\infty$, given by

$$ P_\infty = W_c^{-1/2} \left( W_c^{1/2} W_o W_c^{1/2} \right)^{1/2} W_c^{-1/2} \tag{E.13} $$

Then $T_\infty = P_\infty^{1/2}$, and $S_\infty = T_\infty^{-1}$ is the unique symmetric positive definite
balancing transformation for $(A, B, C)$. The gradient flow $\dot{P}(t) = -\nabla \psi(P(t))$ on the
class of symmetric positive definite matrices is given by

$$ \dot{P} = P^{-1} W_o P^{-1} - W_c, \qquad P(0) = P_0 \tag{E.14} $$

For every initial condition $P_0 = P_0^T > 0$, $P(t)$ exists for all $t \ge 0$ and converges
exponentially fast to $P_\infty$ as $t \to \infty$.
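
The flow (E.14) and the closed form (E.13) can be compared numerically. The following is a minimal sketch, not the implementation of Helmke and Moore: it integrates (E.14) with a plain forward-Euler step, and the Gramians `Wc` and `Wo`, the step size, and the helper names (`sqrtm_spd`, `p_infinity`, `gradient_flow`) are all illustrative assumptions.

```python
import numpy as np

def sqrtm_spd(M):
    """Square root of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def p_infinity(Wc, Wo):
    """Closed-form minimizer of psi(P) = tr(Wc P + Wo P^{-1}), as in (E.13)."""
    Wc_half = sqrtm_spd(Wc)
    Wc_inv_half = np.linalg.inv(Wc_half)
    return Wc_inv_half @ sqrtm_spd(Wc_half @ Wo @ Wc_half) @ Wc_inv_half

def gradient_flow(Wc, Wo, P0, dt=1e-3, steps=40000):
    """Forward-Euler integration of P' = P^{-1} Wo P^{-1} - Wc, as in (E.14)."""
    P = P0.copy()
    for _ in range(steps):
        P_inv = np.linalg.inv(P)
        P = P + dt * (P_inv @ Wo @ P_inv - Wc)
    return P

# Hypothetical 2x2 Gramians, chosen only to be symmetric positive definite.
Wc = np.array([[2.0, 0.5], [0.5, 1.0]])
Wo = np.array([[1.0, 0.2], [0.2, 3.0]])

P_flow = gradient_flow(Wc, Wo, np.eye(2))
P_closed = p_infinity(Wc, Wo)
print(np.max(np.abs(P_flow - P_closed)))  # small if the flow has converged
```

At the equilibrium, $P_\infty W_c P_\infty = W_o$, so $T_\infty = P_\infty^{1/2}$ makes the transformed Gramians $T_\infty W_c T_\infty^T$ and $(T_\infty^T)^{-1} W_o T_\infty^{-1}$ equal.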
Appendix F


Proof of Theorem 4.2.15


Our proof for Theorem 4.2.15 is based on the framework of continuous time optimal
control. Consider the dynamical system

$$ \dot{x}(t) = f(x(t), u(t)), \qquad t_0 \le t \le T \tag{F.1} $$

$$ x(t_0) = x_0 \tag{F.2} $$

where $u(\cdot) \in U$ and

$$ U = \{ u : [0, T] \to \mathbb{R}^m \; : \; u(\cdot) \text{ is measurable} \} \tag{F.3} $$

is the set of admissible controls. Define the cost function

$$ J_{x_0,t_0}(u(\cdot)) = \int_{t_0}^{T} \beta(x(t), u(t)) \, dt + \gamma(x(T)) \tag{F.4} $$

and the value function

$$ V(x_0, t_0) = \inf_{u(\cdot) \in U} J_{x_0,t_0}(u(\cdot)) \tag{F.5} $$

We  use  the  following  result  presented  by  Evans  [45]  which  states  that  the  value 
function  V  satisfies  a  nonlinear  Hamilton-Jacobi-Bellman  PDE. 
Theorem F.0.30 (Evans [45]) The value function $V$ is a weak solution of the
Hamilton-Jacobi-Bellman PDE

$$ \frac{\partial V}{\partial t}(x, t) + H\!\left( \frac{\partial V}{\partial x}(x, t), x \right) = 0 \tag{F.6} $$

with boundary condition

$$ V(x(T), T) = \gamma(x(T)) \tag{F.7} $$

where the Hamiltonian $H$ is given by

$$ H(p, x) = \min_{u \in U} \left\{ f(x, u) \cdot p + \beta(x, u) \right\} \tag{F.8} $$

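Theorem 4.2.15 (Scherpen)

Proof

To derive (4.14), fix $t_0 = 0$ and let $T \to -\infty$. Then in the framework of Evans,
$L_c(x_0)$ corresponds to $V(x_0, 0)$,

$$ J_{x_0,0}(u(\cdot)) = -\int_{0}^{-\infty} \frac{1}{2} u^T(t) \, u(t) \, dt \tag{F.9} $$

$$ \beta(x, u) = -\frac{1}{2} u^T u \tag{F.10} $$

$$ \gamma(x(T)) = 0 \tag{F.11} $$

and

$$ H(p, x) = \min_{u} \left\{ p^T (f + g u) - \frac{1}{2} u^T u \right\} \tag{F.12} $$

where $p^T$ corresponds to $\partial L_c / \partial x \, (x)$. The expression for $H$ is minimized when
$u = g^T p$ so that

$$ H(p, x) = p^T f + \frac{1}{2} p^T g \, g^T p. \tag{F.13} $$

Also, for $u = g^T p$ we have

$$ \dot{x} = f(x) + g(x) \, g^T(x) \, \frac{\partial^T L_c}{\partial x}(x). \tag{F.14} $$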
Using the hypothesis that 0 is an asymptotically stable equilibrium of
$-(f + g \, g^T \, \partial^T L_c / \partial x)$ we get $x \to 0$ as $t \to -\infty$. Finally, since $t_0$ is fixed we have

$$ \frac{\partial L_c}{\partial t}(x) = 0 \tag{F.15} $$

and (F.6) becomes

$$ \frac{\partial L_c}{\partial x}(x) \, f(x) + \frac{1}{2} \frac{\partial L_c}{\partial x}(x) \, g(x) \, g^T(x) \, \frac{\partial^T L_c}{\partial x}(x) = 0 \tag{F.16} $$

with boundary condition

$$ L_c(0) = 0 \tag{F.17} $$

which is the desired equation.

To derive (4.15) we restrict the admissible controls to the singleton set
$U_1 = \{ u : u = 0, \; 0 \le t < \infty \}$. This would not be interesting for a real optimal control
problem and is merely a device so that we can use the Evans framework. The
observability function can then be written as

$$ L_o(x_0) = \min \left\{ \frac{1}{2} \int_0^{\infty} h^T(x(t)) \, h(x(t)) \, dt \; : \; u \in U_1, \; x(0) = x_0 \right\} \tag{F.18} $$

Now fix $t_0 = 0$ and let $T \to \infty$. In the Evans framework $L_o(x_0)$ corresponds to
$V(x_0, 0)$,

$$ J_{x_0,0}(u(\cdot)) = \int_0^{\infty} \frac{1}{2} h^T(x(t)) \, h(x(t)) \, dt \tag{F.19} $$

$$ \beta(x, u) = \frac{1}{2} h^T h \tag{F.20} $$

$$ \gamma(x(T)) = 0 \tag{F.21} $$

and

$$ H(p, x) = \min_{u \in U_1} \left\{ p^T (f + g u) + \frac{1}{2} h^T h \right\} \tag{F.22} $$

where $p^T$ corresponds to $\partial L_o / \partial x \, (x)$. Since $U_1$ is trivial the minimization is trivial,
resulting in

$$ H(p, x) = p^T f + \frac{1}{2} h^T h. \tag{F.23} $$
Using the hypothesis that 0 is an asymptotically stable equilibrium for $\dot{x} = f(x)$
we have $x \to 0$ as $t \to \infty$. Finally, since $t_0$ is fixed (F.6) becomes

$$ \frac{\partial L_o}{\partial x}(x) \, f(x) + \frac{1}{2} h^T(x) \, h(x) = 0 \tag{F.24} $$

with boundary condition

$$ L_o(0) = 0 \tag{F.25} $$

which is the desired equation.
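
As a sanity check on (F.16) and (F.24), consider the linear special case $f(x) = Ax$, $g(x) = B$, $h(x) = Cx$, for which the standard quadratic forms $L_c(x) = \frac{1}{2} x^T W_c^{-1} x$ and $L_o(x) = \frac{1}{2} x^T W_o x$ (with $W_c$, $W_o$ the Gramians) should make both residuals vanish. The sketch below evaluates the residuals at random points. The matrices `A`, `B`, `C` and the helper names are hypothetical choices for illustration, and the Lyapunov solver uses a simple Kronecker-product vectorization that is only sensible for small systems.

```python
import numpy as np

def lyap(A, Q):
    """Solve A X + X A^T + Q = 0 by column-major vectorization (small n only)."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(I, A) + np.kron(A, I)
    vec_x = np.linalg.solve(K, -Q.flatten(order="F"))
    return vec_x.reshape((n, n), order="F")

# Hypothetical stable, controllable, observable linear system.
A = np.array([[-1.0, 2.0], [0.0, -3.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

Wc = lyap(A, B @ B.T)      # A Wc + Wc A^T + B B^T = 0
Wo = lyap(A.T, C.T @ C)    # A^T Wo + Wo A + C^T C = 0
Wc_inv = np.linalg.inv(Wc)

def residual_F16(x):
    """Left-hand side of (F.16) with Lc(x) = 0.5 x^T Wc^{-1} x."""
    dLc = Wc_inv @ x
    return dLc @ (A @ x) + 0.5 * dLc @ B @ B.T @ dLc

def residual_F24(x):
    """Left-hand side of (F.24) with Lo(x) = 0.5 x^T Wo x."""
    dLo = Wo @ x
    hx = C @ x
    return dLo @ (A @ x) + 0.5 * hx @ hx

rng = np.random.default_rng(0)
samples = rng.normal(size=(5, 2))
print([float(residual_F16(x)) for x in samples])
print([float(residual_F24(x)) for x in samples])
```

Both residuals reduce to quadratic forms in the Lyapunov equations for $W_c$ and $W_o$, which is why they vanish up to rounding error.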
Appendix G


Physical Constants


Listed here are the physical constants used in the models. The units have been
selected for convenience and consistency. Properties of the wafer are those of pure
silicon. Chamber wall properties are those of quartz. Properties of the process
gases are those of hydrogen at 1000 K and 1 atm. Chemical kinetics parameters
are those experimentally determined from reactions involving thermally activated
deposition of polysilicon from 30 sccm of 2% silane in hydrogen.
Constant   Description                       Value             Units
k_0        Arrhenius Coefficient             3.0787 x 10^3     cm sec^-1
E_a        Activation Energy                 1.6330 x 10^5     J mol^-1
R_g        Gas Constant                      8.314             J mol^-1 K^-1
h_ref      Reference Thickness               1.0 x 10^-4       cm
Pr         Rate Pre-Exponential Constant     1.8472 x 10^9     dimensionless
Pe         Rate Exponential Constant         2.8059 x 10^1     dimensionless
hw         Thermal Conductivity of Wafer     0.22              W cm^-1 K^-1
ρ_w        Mass Density of Wafer             2.3               g cm^-3
C_pw       Heat Capacity of Wafer            2.3               J g^-1 K^-1
σ_b        Boltzmann Constant                5.677 x 10^-12    W cm^-2 K^-4
ε_w        Emissivity of Wafer               0.7               dimensionless
α_w        Absorptivity of Wafer             0.5               dimensionless
R_w        Radius of Wafer                   7.62              cm
Δz         Thickness of Wafer                0.05              cm
hy         Convective Heat Transfer Coeff    2.6474 x 10^-4    W cm^-2 K^-1
Re         Reynolds Number of Gas Flow       27.2              dimensionless
k_g        Thermal Conductivity of Gas       4.40 x 10^-3      W cm^-1 K^-1
Pr         Prandtl Number of Gas Flow        0.686             dimensionless
L          Chamber Length                    50.8              cm
T          Chamber Wall (Ambient) Temp       700               K
ε_c        Emissivity of Chamber Wall        0.37              dimensionless
T_g        Gas Temperature                   300               K
τ          Reference Time                    60                seconds
Q_ref      Reference Heat Flux               29.24             W cm^-2
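
The table does not say how the two dimensionless rate constants are obtained. One reading, which is an assumption here and not something stated in this appendix, is that the pre-exponential constant is the Arrhenius coefficient scaled by the reference time and reference thickness, and that the exponential constant is the activation energy scaled by the gas constant and the chamber wall temperature. The short sketch below carries out that arithmetic with the tabulated values; the variable names are illustrative.

```python
# Values copied from the table of physical constants.
k0 = 3.0787e3        # Arrhenius coefficient, cm/sec
E_a = 1.6330e5       # activation energy, J/mol
R_g = 8.314          # gas constant, J/(mol K)
h_ref = 1.0e-4       # reference thickness, cm
tau = 60.0           # reference time, sec
T_wall = 700.0       # chamber wall (ambient) temperature, K

# Assumed scalings (not stated in the source text).
rate_pre_exponential = k0 * tau / h_ref
rate_exponential = E_a / (R_g * T_wall)

print(f"{rate_pre_exponential:.4e}")  # compare with the tabulated 1.8472 x 10^9
print(f"{rate_exponential:.4e}")      # compare with the tabulated 2.8059 x 10^1
```

Under these assumed scalings the computed numbers agree with the tabulated dimensionless constants to the digits shown, which also serves as a consistency check on the transcribed values.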
Appendix H


View Factor Analysis for Lamp
Heating in the Epsilon-1


The approach we take to determine the heat flux spatial profiles is based on the
concept of view factor [124, 145], which describes the radiation exchange between
two or more surfaces separated by a non-participating medium that does not absorb,
emit, or scatter radiation. The view factor between two surfaces represents
the fraction of radiative energy leaving one surface that strikes the other surface
directly.

In this method, the geometry of the chamber, including location and shape of
lamps, susceptor, reflectors, and possibly other apparatus, is what determines the
form of the resulting flux profiles. This geometric approach was adopted in [63],
where the authors consider only a 2-dimensional slice of the chamber geometry and
include the effect of reflectors behind the lamp banks. There, the 2-dimensional
approach was reasonable, perhaps, since the lamp arrangement in the reactor under
consideration was axisymmetric about the wafer center. This is, however,
not the case in the Epsilon-1 reactor. Hence, our analysis is similar to that used
in [41], where the authors consider the chamber geometry from a 3-dimensional
point of view. However, in that paper, as here, the effect of reflectors is
not included.

Assumptions 

In  the  actual  reactor,  the  internal  surface  of  the  chamber  lid  is  gold  plated  to 
reflect  infrared  rays  from  the  linear  lamps,  and  the  spot  lamps  are  placed  in  gold 
plated  parabolic  reflectors.  However,  we  do  not  consider  the  effect  of  reflections  on 
the  lamp  heating  of  the  wafer.  In  addition,  the  literature  indicates  that  “virtual 
images”,  or  radiation  from  the  heated  wafer  to  the  reflectors  and  chamber  walls 
which  is  reflected  back  to  the  wafer,  will  cause  additional  radiative  effects.  These 
effects  are  not  included  in  the  analysis  here. 

We consider all surfaces to be diffuse reflectors and diffuse emitters. Radiant
intensity from the lamps is assumed to be independent of direction and constant
across the length of the lamp. We assume that the quartz walls and the process
gases transmit heat radiation from the lamps perfectly at the wavelengths of interest.
Furthermore, we assume that the path from lamps to wafer (or lamps to
susceptor) is completely free of any other obstacles.

Lamp  Geometry 

Figure H.1 shows a schematic of the upper lamp array superimposed over the
susceptor and wafer, which is based on a description and diagram provided in [9].
For computational purposes, we consider each linear lamp to be a straight line
segment of length 11.5 inches with the array consisting of parallel equally spaced
lamps. The array begins directly above the susceptor edge, 5.0 inches (horizontally)
from the susceptor center. The distance between neighboring parallel lamps in
the array is 1.25 inches. The vertical distance between wafer and upper lamp
array is 2.25 inches, and the vertical distance between wafer and lower lamp array
is 3.50 inches. We note that the distances given are estimates based on crude
measurements taken on the reactor itself.

Figure H.1: Geometry (top view) of upper lamp array: radial position of each lamp
is given in inches from center; lamps are identified by five uniquely distinguishable
positions, numbered 1 through 5.

There are five lamp positions for the linear lamps that can be uniquely differentiated
from the others. This is due to the wafer rotation. For example, two linear
lamps equally distant from the center linear lamp have an identical irradiating effect
on the wafer surface. The five lamp positions are numbered 1 through 5. The
spot lamps have their own unique geometry and are analyzed separately later.

The  source  of  radiation  for  each  lamp  is  a  tungsten  filament,  which  we  assume 
to  be  a  straight  line  segment  stretching  the  length  of  the  lamp.  Figure  H.2  shows 
the  geometry  used  to  perform  the  analysis.  We  assume  that  for  each  filament  the 
radiant  intensity  is  independent  of  direction  and  constant  across  the  length  of  the 
filament. 
Figure H.2: Geometry for view factor analysis used to calculate heat flux intensity
profiles for linear lamps.


For each point on the wafer, $w$, there is an irradiance contribution from each
point on the filament, $f$, depending upon the distance between them, $d = \|w - f\|$,
the angle $\theta_w$ formed by the vector $w - f$ and the vector $n_w$ normal to the wafer
surface, and the vertical distance, $h$, from wafer surface to filament. Note that
on the filament diagram the endpoint values are $f_1 = -5.75$ inches and $f_2 = 5.75$
inches, and the vertical distance $h$ is either 2.25 inches for the upper array or 3.50
inches for the lower array.


View  Factor  Analysis 

For the derivation of the expression for heat flux radiant power per unit area on
the wafer surface, we adopt the notation used in [124]. The rate of radiative energy
$dQ_f$ leaving a differential surface area $dA_f$ (containing the point $f$) on the filament
that strikes a differential surface area $dA_w$ (containing the point $w$) on the wafer
surface is given by

$$ dQ_f = dA_f \, I_f \cos(\theta_f) \, d\omega_{fw} \tag{H.1} $$

where $I_f$ is the intensity of radiative energy leaving $dA_f$ in all directions in hemispherical
space (in dimensions of Watts per unit area per steradian), $\theta_f$ is the angle
formed by the vector $w - f$ and a vector normal to $dA_f$, and $d\omega_{fw}$ is the solid
angle subtended by $dA_w$ from $f$ given by

$$ d\omega_{fw} = \frac{dA_w \cos(\theta_w)}{d^2} \tag{H.2} $$

Substituting (H.2) into (H.1) yields

$$ dQ_f = dA_f \, I_f \, \frac{\cos(\theta_f) \cos(\theta_w) \, dA_w}{d^2} \tag{H.3} $$


Now, the rate of radiation energy $Q_f$ leaving the surface element $dA_f$ on the
filament in all directions over hemispherical space is [124]

$$ Q_f = \pi \, I_f \, dA_f \tag{H.4} $$

The elemental view factor $dF_{dA_f - dA_w}$ is defined as the ratio of the radiative energy
leaving $dA_f$ that strikes $dA_w$ directly to the radiative energy leaving $dA_f$ in all
directions into the hemispherical space. Thus, we divide (H.3) by (H.4) to give the
view factor

$$ dF_{dA_f - dA_w} = \frac{dQ_f}{Q_f} = \frac{\cos(\theta_f) \cos(\theta_w) \, dA_w}{\pi d^2} \tag{H.5} $$

Since we are assuming that filament radiant intensity is independent of direction,
we take $\theta_f = 0$ independent of filament position $f$ so that $\cos(\theta_f) = 1$
and

$$ dF_{dA_f - dA_w} = \frac{\cos(\theta_w) \, dA_w}{\pi d^2} \tag{H.6} $$

We are interested in the radiative energy illuminating a differential area on the
wafer due to the entire filament. To compute the appropriate view factor, $F_{f - dA_w}$,
we average (H.6) across the length of the filament

$$ F_{f - dA_w} = \frac{dA_w}{|f_1 - f_2|} \int_{f_1}^{f_2} \frac{\cos(\theta_w)}{\pi d^2} \, df \tag{H.7} $$
Finally, we observe that

$$ \cos(\theta_w) = \frac{h}{d} $$

for the given geometry, so that

$$ F_{f - dA_w} = \frac{h \, dA_w}{\pi \, |f_1 - f_2|} \int_{f_1}^{f_2} \frac{1}{d^3} \, df \tag{H.8} $$

where we recall that $d = \|w - f\|$.

To determine the radiant heat flux profile for a given lamp, the view factor
$F_{f - dA_w}$ must be computed for each differential area $dA_w$ on the wafer surface.
For practical purposes, we discretize the wafer surface by choosing a cylindrical
grid of wafer points $w = (r, \theta)$. We then assume that the differential area, $dA_w$,
is constant for all wafer points $w$. Thus, (H.8) yields the view factor function
$F_{f - dA_w}(r, \theta)$ which gives the fraction of radiative energy leaving the given lamp
filament that strikes the given wafer point $w = (r, \theta)$ directly.

Now, we let $P_f$ denote the radiant power supplied by the filament, so that
$P_f / dA_w$ gives the radiant heat flux intensity striking the differential area $dA_w$.
The radiant heat flux intensity profile of the illumination due to the lamp filament
is then given by

$$ q_f(r, \theta) = F_{f - dA_w}(r, \theta) \, \frac{P_f}{dA_w} \tag{H.9} $$

$$ \phantom{q_f(r, \theta)} = \frac{P_f \, h}{\pi \, |f_1 - f_2|} \int_{f_1}^{f_2} \frac{1}{d(r, \theta)^3} \, df \tag{H.10} $$

where the value we use for $P_f$ is provided by the manufacturer. In the case of the
ASM Epsilon-1 reactor, the linear lamps supply 6000 Watts and the spot lamps
supply 1000 Watts.

Since the wafer rotates at a uniform rate, this function is averaged over the
circle (i.e., $0 \le \theta \le 2\pi$) at each radial position $r$ on the wafer top surface

$$ q_f(r) = \frac{1}{2\pi} \int_0^{2\pi} q_f(r, \phi) \, d\phi \tag{H.11} $$

to give the heat flux profile

$$ q_f(r) = \frac{P_f \, h}{2 \pi^2 \, |f_1 - f_2|} \int_0^{2\pi} \int_{f_1}^{f_2} \frac{1}{d(r, \phi)^3} \, df \, d\phi. \tag{H.12} $$

A similar analysis was performed for the spot lamps, except that each spot
lamp was considered to be a point source of radiant energy, thus simplifying the
analysis significantly.

The  computational  procedure  was  performed  for  each  of  the  five  different  linear 
lamp  positions  for  both  upper  and  lower  arrays,  and  for  the  spot  lamps.  Using 
the  resulting  heat  flux  intensity  spatial  profiles,  we  can  then  compute  the  desired 
profiles  for  each  of  the  ten  lamp  groups,  and  the  four  heat  zones  of  the  Epsilon-1 
reactor  by  appropriately  combining  the  profiles  determined  from  the  individual 
lamps. 
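
The double integral in (H.12) is straightforward to evaluate numerically. The following is a minimal sketch of one way to do so, using a midpoint rule in both the filament coordinate and the rotation angle; it is not the code used to produce the results reported here. The coordinate convention is an assumption: the filament is taken to lie parallel to the y-axis at horizontal offset `x_lamp` (a hypothetical parameter) and height `h` above the wafer plane, with lengths in inches and the result converted to W/cm^2.

```python
import numpy as np

IN_TO_CM = 2.54

def flux_profile(r, x_lamp, h, P_f=6000.0, f1=-5.75, f2=5.75,
                 n_f=400, n_phi=360):
    """Rotation-averaged flux q_f(r) in W/cm^2, following (H.12).

    r, x_lamp, h, f1, f2 are in inches; P_f is in Watts.
    """
    # Midpoint nodes along the filament and around the circle.
    f = f1 + (np.arange(n_f) + 0.5) * (f2 - f1) / n_f
    phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
    PHI, F = np.meshgrid(phi, f, indexing="ij")

    # Wafer point w = (r cos(phi), r sin(phi), 0); filament point (x_lamp, f, h).
    d = np.sqrt((r * np.cos(PHI) - x_lamp) ** 2
                + (r * np.sin(PHI) - F) ** 2 + h ** 2)

    # Midpoint-rule approximation of the double integral of 1/d^3.
    integral = np.sum(d ** -3) * ((f2 - f1) / n_f) * (2.0 * np.pi / n_phi)

    q_per_sq_inch = P_f * h / (2.0 * np.pi ** 2 * abs(f1 - f2)) * integral
    return q_per_sq_inch / IN_TO_CM ** 2

# Flux at the wafer center from a lamp directly overhead, upper vs. lower array.
print(flux_profile(0.0, 0.0, 2.25), flux_profile(0.0, 0.0, 3.50))
```

At $r = 0$ with the lamp directly overhead, the inner integral has the closed form $\int_{f_1}^{f_2} (f^2 + h^2)^{-3/2} \, df = \left[ f / (h^2 \sqrt{f^2 + h^2}) \right]_{f_1}^{f_2}$, which gives a convenient check on the quadrature.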


Results 

Here  we  discuss  some  results  of  the  analysis.  Note  that  in  what  follows,  the  term 
“wafer  surface”  may  represent  the  top  surface  of  the  wafer  and  exposed  susceptor, 
or  the  bottom  surface  of  the  susceptor,  depending  upon  the  lamp  group  being 
considered. 

Figures  H.3  and  H.4  show  the  heat  flux  irradiated  on  the  wafer  surface  by  lamps 
in  positions  1,  2,  3,  and  4  of  the  upper  and  lower  array,  respectively,  and  the  spot 
lamp  position.  As  expected,  points  on  the  wafer  surface  directly  under  (or  over) 
the  lamp  filament  receive  the  most  intense  illumination,  i.e.  the  maximum  flux 
value.  Intensities  are  greater  in  magnitude  for  lamps  in  the  upper  array  since  it 
is  physically  closer  to  the  wafer  than  the  lower  array  and  spot  lamps.  Spot  lamps 
have  lower  flux  intensities  than  linear  lamps  due  to  the  smaller  supplied  power. 

Figure H.3: Heat flux intensity profiles for linear lamps. Top: upper lamp array;
Bottom: lower lamp array (flux intensity (W/cm^2) versus position in two dimensions.
Upper left: Position 1; Upper right: Position 2; Lower left: Position 3;
Lower right: Position 4).

Figure H.4: Heat flux intensity profile for spot lamp: flux intensity (W/cm^2) versus
position in two dimensions.

To account for wafer rotation, the flux intensity profiles are averaged around
360 degrees resulting in profiles that are a function of radial position only.
Figure H.5 shows the heat flux profiles, after averaging, for each of the individual
lamp positions. Observe that, as expected, the lamp position directly over (or under)
the wafer center irradiates the wafer center with greater intensity than the
other positions. Lamp positions closer to the edge irradiate the edge with greater
intensity than they irradiate the center.

Figure H.6 shows the heat flux profiles for each of the ten lamp groups.
Figure H.7 shows the heat flux profiles for the four heating zones: center, front, rear,
and side. The flux intensity for the center zone is significantly greater than for
the others, indicating that it will have the greatest heating effect. Observe that
profiles for front and rear zones are identical due to the symmetry assumptions
and the way in which the individual lamps are organized to form the zones.

Figure H.5: Heat flux intensity profiles for individual lamps. Top: upper array;
Bottom: lower array (flux intensity (W/cm^2) versus radial position for the five
uniquely distinguishable linear lamp positions and the spot lamp position).

Figure H.6: Heat flux intensity profiles for ASM Epsilon-1 lamp groups: flux
intensity (W/cm^2) versus radial position (inches) for the ten lamp groups.

Figure H.7: Heat flux intensity profiles for ASM Epsilon-1 heat zones: flux intensity
(W/cm^2) versus radial position (inches) for the four heat zones.
Appendix I


PHOENICS Q1 Source File for
Epsilon-1 Poly-Si Growth
Simulation


This appendix contains the input code, called a Q1 file, for the PHOENICS CFD
software package. The file was used to simulate deposition of poly-Si on a silicon
wafer with wafer temperature of 750 C, silane flow rate of 30 sccm, and chamber
pressure of 20 Torr. Input files for simulations with other process conditions are
similar.
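
The grid definition in the listing that follows can be cross-checked with a little arithmetic: the cell counts given on the GSET(L,...) lines should add up to the NX and NY values in RSET(M,...), and the z-increments on the GSET(C,...) plane-copy lines should add up to the domain length ZSI. The sketch below performs that check in Python (it is not PHOENICS input); the numbers are transcribed from the listing, and the grouping of lines into x- and y-directions is an interpretation of the listing rather than something it states.

```python
# Values transcribed from the Q1 listing.
NX, NY, NZ = 25, 27, 52          # RSET(M,25,27,52)
ZSI = 6.068e-01                  # overall domain extent in z

# Cells along one chain of lines in each direction (interpretation of the listing).
x_cells = [12, 1, 4, 4, 3, 1]    # A00, LH10, LH20, LH30, LH40, LH50
y_cells = [1, 8, 5, 12, 1]       # LV01, LV02, LV03, LV04, LV05

# Target K-planes and z-increments from the GSET(C,...) lines.
z_planes = [11, 15, 19, 23, 31, 35, 39, 43, 53]
z_steps = [1.2306e-01, 4.4960e-03, 1.9960e-02, 3.6195e-02, 1.5240e-01,
           3.6195e-02, 3.8100e-02, 4.4960e-03, 1.9189e-01]

def start_indices(cells):
    """First cell index of each segment, counting from 1."""
    starts, first = [], 1
    for n in cells:
        starts.append(first)
        first += n
    return starts

print(sum(x_cells), sum(y_cells))   # expected to equal NX and NY
print(start_indices(x_cells))       # compare with the I-indices on the GSET(M,...) lines
print(start_indices(y_cells))       # compare with the J-indices on the GSET(M,...) lines
print(sum(z_steps))                 # expected to be close to ZSI
```

The starting indices produced this way match the I and J corner indices used on the GSET(M,...) lines, and the last copied plane, K53, is plane NZ+1.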

CPVNAM=CVD
IRUNN = 1 ;LIBREF = 14

  Group 1. Run Title
TEXT(POLY-SI DEP 750 C 30 sccm SiH4 20 Torr )

  Group 2. Transience
STEADY = T

  Groups 3, 4, 5 Grid Information
  * Overall number of cells, RSET(M,NX,NY,NZ,tolerance)
RSET(M,25,27,52)
  * Set overall domain extent:
  *        xulast        yvlast        zwlast    name
XSI= 1.651000E-01; YSI= 1.094800E-01; ZSI= 6.068000E-01
RSET(D,EPS1 )

  Group 6. Body-Fitted coordinates
BFC=T
  * Set points
XPO= 7.6200E-02;YPO=-4.7000E-03;ZPO= 0.0000E+00;GSET(P,PW1 )
XPO= 7.6200E-02;YPO= 4.7000E-03;ZPO= 0.0000E+00;GSET(P,PW2 )
XPO= 0.0000E+00;YPO=-5.4740E-02;ZPO= 0.0000E+00;GSET(P,P00 )
XPO= 0.0000E+00;YPO=-4.8740E-02;ZPO= 0.0000E+00;GSET(P,P01 )
XPO= 0.0000E+00;YPO=-4.7000E-03;ZPO= 0.0000E+00;GSET(P,P02 )
XPO= 0.0000E+00;YPO= 4.7000E-03;ZPO= 0.0000E+00;GSET(P,P03 )
XPO= 0.0000E+00;YPO= 4.8740E-02;ZPO= 0.0000E+00;GSET(P,P04 )
XPO= 0.0000E+00;YPO= 5.4740E-02;ZPO= 0.0000E+00;GSET(P,P05 )
XPO= 1.1125E-01;YPO=-2.9718E-02;ZPO= 0.0000E+00;GSET(P,P10 )
XPO= 1.1125E-01;YPO=-2.3719E-02;ZPO= 0.0000E+00;GSET(P,P11 )
XPO= 1.1125E-01;YPO=-4.7000E-03;ZPO= 0.0000E+00;GSET(P,P12 )
XPO= 1.1125E-01;YPO= 4.7000E-03;ZPO= 0.0000E+00;GSET(P,P13 )
XPO= 1.1125E-01;YPO= 2.3719E-02;ZPO= 0.0000E+00;GSET(P,P14 )
XPO= 1.1125E-01;YPO= 2.9718E-02;ZPO= 0.0000E+00;GSET(P,P15 )
XPO= 1.1240E-01;YPO=-2.9506E-02;ZPO= 0.0000E+00;GSET(P,P20 )
XPO= 1.1240E-01;YPO=-2.3198E-02;ZPO= 0.0000E+00;GSET(P,P21 )
XPO= 1.1240E-01;YPO=-4.7000E-03;ZPO= 0.0000E+00;GSET(P,P22 )
XPO= 1.1240E-01;YPO= 4.7000E-03;ZPO= 0.0000E+00;GSET(P,P23 )
XPO= 1.1240E-01;YPO= 2.3198E-02;ZPO= 0.0000E+00;GSET(P,P24 )
XPO= 1.1240E-01;YPO= 2.9506E-02;ZPO= 0.0000E+00;GSET(P,P25 )
XPO= 1.2581E-01;YPO=-2.7021E-02;ZPO= 0.0000E+00;GSET(P,P30 )
XPO= 1.2581E-01;YPO=-1.7092E-02;ZPO= 0.0000E+00;GSET(P,P31 )
XPO= 1.2581E-01;YPO=-4.7000E-03;ZPO= 0.0000E+00;GSET(P,P32 )
XPO= 1.2581E-01;YPO= 4.7000E-03;ZPO= 0.0000E+00;GSET(P,P33 )
XPO= 1.2581E-01;YPO= 1.7092E-02;ZPO= 0.0000E+00;GSET(P,P34 )
XPO= 1.2581E-01;YPO= 2.7021E-02;ZPO= 0.0000E+00;GSET(P,P35 )
XPO= 1.3030E-01;YPO=-2.6188E-02;ZPO= 0.0000E+00;GSET(P,P40 )
XPO= 1.3030E-01;YPO=-1.5045E-02;ZPO= 0.0000E+00;GSET(P,P41 )
XPO= 1.3030E-01;YPO=-4.7000E-03;ZPO= 0.0000E+00;GSET(P,P42 )
XPO= 1.3030E-01;YPO= 4.7000E-03;ZPO= 0.0000E+00;GSET(P,P43 )
XPO= 1.3030E-01;YPO= 1.5045E-02;ZPO= 0.0000E+00;GSET(P,P44 )
XPO= 1.3030E-01;YPO= 2.6188E-02;ZPO= 0.0000E+00;GSET(P,P45 )
XPO= 1.4605E-01;YPO=-2.3270E-02;ZPO= 0.0000E+00;GSET(P,P50 )
XPO= 1.4605E-01;YPO=-7.8750E-03;ZPO= 0.0000E+00;GSET(P,P51 )
XPO= 1.4605E-01;YPO=-4.7000E-03;ZPO= 0.0000E+00;GSET(P,P52 )
XPO= 1.4605E-01;YPO= 4.7000E-03;ZPO= 0.0000E+00;GSET(P,P53 )
XPO= 1.4605E-01;YPO= 7.8750E-03;ZPO= 0.0000E+00;GSET(P,P54 )
XPO= 1.4605E-01;YPO= 2.3270E-02;ZPO= 0.0000E+00;GSET(P,P55 )
XPO= 1.6510E-01;YPO=-1.9740E-02;ZPO= 0.0000E+00;GSET(P,P60 )
XPO= 1.6510E-01;YPO=-7.8750E-03;ZPO= 0.0000E+00;GSET(P,P61 )
XPO= 1.6510E-01;YPO=-4.7000E-03;ZPO= 0.0000E+00;GSET(P,P62 )
XPO= 1.6510E-01;YPO= 4.7000E-03;ZPO= 0.0000E+00;GSET(P,P63 )
XPO= 1.6510E-01;YPO= 7.8750E-03;ZPO= 0.0000E+00;GSET(P,P64 )
XPO= 1.6510E-01;YPO= 1.9740E-02;ZPO= 0.0000E+00;GSET(P,P65 )
XPO= 5.0000E-02;YPO=-5.0929E-02;ZPO= 0.0000E+00;GSET(P,PA0 )
XPO= 5.0000E-02;YPO=-4.4929E-02;ZPO= 0.0000E+00;GSET(P,PA1 )
XPO= 5.0000E-02;YPO= 4.4929E-02;ZPO= 0.0000E+00;GSET(P,PA2 )
XPO= 5.0000E-02;YPO= 5.0929E-02;ZPO= 0.0000E+00;GSET(P,PA3 )
  * Set lines/arcs
GSET(L,LVW,PW1,PW2,5,1.0)
GSET(L,LV01,P00,P01,1,1.0)
GSET(L,LV02,P01,P02,8,.7)
GSET(L,LV03,P02,P03,5,1)
GSET(L,LV04,P03,P04,12,1.5)
GSET(L,LV05,P04,P05,1,1.0)
GSET(L,LV11,P10,P11,1,1.0)
GSET(L,LV12,P11,P12,8,.7)
GSET(L,LV13,P12,P13,5,1.0)
GSET(L,LV14,P13,P14,12,1.5)
GSET(L,LV15,P14,P15,1,1.0)
GSET(L,LV21,P20,P21,1,1.0)
GSET(L,LV22,P21,P22,8,.7)
GSET(L,LV23,P22,P23,5,1.0)
GSET(L,LV24,P23,P24,12,1.5)
GSET(L,LV25,P24,P25,1,1.0)
GSET(L,LV31,P30,P31,1,1.0)
GSET(L,LV32,P31,P32,8,.7)
GSET(L,LV33,P32,P33,5,1.0)
GSET(L,LV34,P33,P34,12,1.5)
GSET(L,LV35,P34,P35,1,1.0)
GSET(L,LV41,P40,P41,1,1.0)
GSET(L,LV42,P41,P42,8,.7)
GSET(L,LV43,P42,P43,5,1.0)
GSET(L,LV44,P43,P44,12,1.5)
GSET(L,LV45,P44,P45,1,1.0)
GSET(L,LV51,P50,P51,1,1.0)
GSET(L,LV52,P51,P52,8,.7)
GSET(L,LV53,P52,P53,5,1.0)
GSET(L,LV54,P53,P54,12,1.5)
GSET(L,LV55,P54,P55,1,1.0)
GSET(L,LV61,P60,P61,1,1.0)
GSET(L,LV62,P61,P62,8,.7)
GSET(L,LV63,P62,P63,5,1.0)
GSET(L,LV64,P63,P64,12,1.5)
GSET(L,LV65,P64,P65,1,1.0)
GSET(L,A00,P00,P10,12,1.0,ARC,PA0)
GSET(L,A01,P01,P11,12,1.0,ARC,PA1)
GSET(L,A04,P04,P14,12,1.0,ARC,PA2)
GSET(L,A05,P05,P15,12,1.0,ARC,PA3)
GSET(L,LH01,P02,PW1,8,1.0)
GSET(L,LH02,PW1,P12,4,1.0)
GSET(L,LH03,P03,PW2,8,1.0)
GSET(L,LH04,PW2,P13,4,1.0)
GSET(L,LH10,P10,P20,1,1.0)
GSET(L,LH11,P11,P21,1,1.0)
GSET(L,LH12,P12,P22,1,1.0)
GSET(L,LH13,P13,P23,1,1.0)
GSET(L,LH14,P14,P24,1,1.0)
GSET(L,LH15,P15,P25,1,1.0)
GSET(L,LH20,P20,P30,4,1.0)
GSET(L,LH21,P21,P31,4,1.0)
GSET(L,LH22,P22,P32,4,1.0)
GSET(L,LH23,P23,P33,4,1.0)
GSET(L,LH24,P24,P34,4,1.0)
GSET(L,LH25,P25,P35,4,1.0)
GSET(L,LH30,P30,P40,4,1.0)
GSET(L,LH31,P31,P41,4,1.0)
GSET(L,LH32,P32,P42,4,1.0)
GSET(L,LH33,P33,P43,4,1.0)
GSET(L,LH34,P34,P44,4,1.0)
GSET(L,LH35,P35,P45,4,1.0)
GSET(L,LH40,P40,P50,3,1.0)
GSET(L,LH41,P41,P51,3,1.0)
GSET(L,LH42,P42,P52,3,1.0)
GSET(L,LH43,P43,P53,3,1.0)
GSET(L,LH44,P44,P54,3,1.0)
GSET(L,LH45,P45,P55,3,1.0)
GSET(L,LH50,P50,P60,1,1.0)
GSET(L,LH51,P51,P61,1,1.0)
GSET(L,LH52,P52,P62,1,1.0)
GSET(L,LH53,P53,P63,1,1.0)
GSET(L,LH54,P54,P64,1,1.0)
GSET(L,LH55,P55,P65,1,1.0)
  * Set frames
GSET(F,F01,P00,-,P10,-,P11,-,P01,-)
GSET(F,F02,P01,-,P11,-,P12,PW1,P02,-)
GSET(F,F031,P02,-,PW1,-,PW2,-,P03,-)
GSET(F,F032,PW1,-,P12,-,P13,-,PW2,-)
GSET(F,F04,P03,PW2,P13,-,P14,-,P04,-)
GSET(F,F05,P04,-,P14,-,P15,-,P05,-)
GSET(F,F11,P10,-,P20,-,P21,-,P11,-)
GSET(F,F12,P11,-,P21,-,P22,-,P12,-)
GSET(F,F13,P12,-,P22,-,P23,-,P13,-)
GSET(F,F14,P13,-,P23,-,P24,-,P14,-)
GSET(F,F15,P14,-,P24,-,P25,-,P15,-)
GSET(F,F21,P20,-,P30,-,P31,-,P21,-)
GSET(F,F22,P21,-,P31,-,P32,-,P22,-)
GSET(F,F23,P22,-,P32,-,P33,-,P23,-)
GSET(F,F24,P23,-,P33,-,P34,-,P24,-)
GSET(F,F25,P24,-,P34,-,P35,-,P25,-)
GSET(F,F31,P30,-,P40,-,P41,-,P31,-)
GSET(F,F32,P31,-,P41,-,P42,-,P32,-)
GSET(F,F33,P32,-,P42,-,P43,-,P33,-)
GSET(F,F34,P33,-,P43,-,P44,-,P34,-)
GSET(F,F35,P34,-,P44,-,P45,-,P35,-)
GSET(F,F41,P40,-,P50,-,P51,-,P41,-)
GSET(F,F42,P41,-,P51,-,P52,-,P42,-)
GSET(F,F43,P42,-,P52,-,P53,-,P43,-)
GSET(F,F44,P43,-,P53,-,P54,-,P44,-)
GSET(F,F45,P44,-,P54,-,P55,-,P45,-)
GSET(F,F51,P50,-,P60,-,P61,-,P51,-)
GSET(F,F52,P51,-,P61,-,P62,-,P52,-)
GSET(F,F53,P52,-,P62,-,P63,-,P53,-)
GSET(F,F54,P53,-,P63,-,P64,-,P54,-)
GSET(F,F55,P54,-,P64,-,P65,-,P55,-)

  * Match a grid mesh
GSET(M,F01,+I+J,1,1,1,LAP5)
GSET(M,F02,+I+J,1,2,1,LAP5)
GSET(M,F031,+I+J,1,10,1,LAP5)
GSET(M,F032,+I+J,9,10,1,LAP5)
GSET(M,F04,+I+J,1,15,1,LAP5)
GSET(M,F05,+I+J,1,27,1,LAP5)
GSET(M,F11,+I+J,13,1,1,LAP5)
GSET(M,F12,+I+J,13,2,1,LAP5)
GSET(M,F13,+I+J,13,10,1,LAP5)
GSET(M,F14,+I+J,13,15,1,LAP5)
GSET(M,F15,+I+J,13,27,1,LAP5)
GSET(M,F21,+I+J,14,1,1,LAP5)
GSET(M,F22,+I+J,14,2,1,LAP5)
GSET(M,F23,+I+J,14,10,1,LAP5)
GSET(M,F24,+I+J,14,15,1,LAP5)
GSET(M,F25,+I+J,14,27,1,LAP5)
GSET(M,F31,+I+J,18,1,1,LAP5)
GSET(M,F32,+I+J,18,2,1,LAP5)
GSET(M,F33,+I+J,18,10,1,LAP5)
GSET(M,F34,+I+J,18,15,1,LAP5)
GSET(M,F35,+I+J,18,27,1,LAP5)
GSET(M,F41,+I+J,22,1,1,LAP5)
GSET(M,F42,+I+J,22,2,1,LAP5)
GSET(M,F43,+I+J,22,10,1,LAP5)
GSET(M,F44,+I+J,22,15,1,LAP5)
GSET(M,F45,+I+J,22,27,1,LAP5)
GSET(M,F51,+I+J,25,1,1,LAP5)
GSET(M,F52,+I+J,25,2,1,LAP5)
GSET(M,F53,+I+J,25,10,1,LAP5)
GSET(M,F54,+I+J,25,15,1,LAP5)
GSET(M,F55,+I+J,25,27,1,LAP5)

  * Copy/Transfer/Block grid planes
GSET(C,K11,F,K1,1,25,1,27,+,0,0,1.2306E-01,INC,1)
GSET(C,K15,F,K11,1,25,1,27,+,0,0,4.4960E-03,INC,1)
GSET(C,K19,F,K15,1,25,1,27,+,0,0,1.9960E-02,INC,1)
GSET(C,K23,F,K19,1,25,1,27,+,0,0,3.6195E-02,INC,1)
GSET(C,K31,F,K23,1,25,1,27,+,0,0,1.5240E-01,INC,1)
GSET(C,K35,F,K31,1,25,1,27,+,0,0,3.6195E-02,INC,1)
GSET(C,K39,F,K35,1,25,1,27,+,0,0,3.8100E-02,INC,1)
GSET(C,K43,F,K39,1,25,1,27,+,0,0,4.4960E-03,INC,1)
GSET(C,K53,F,K43,1,25,1,27,+,0,0,1.9189E-01,INC,1)
NONORT = T

*  X-cyclic  boundaries  switched 

Group  7.  Variables:  STOREd,SOLVEd,NAMEd
ONEPHS  =  T 

*  Non-default  variable  names 

NAME( 16) =S80  ;NAME( 17) =S140
NAME( 18) =S142 ;NAME( 19) =S145
NAME( 20) =S147 ;NAME( 21) =S158
NAME(141) =BLOK ;NAME(142) =WCRT
NAME(143) =VCRT ;NAME(144) =UCRT
NAME(145) =TEM1 ;NAME(146) =DEPO
NAME(147) =PRPS ;NAME(148) =ENUL
NAME(149) =RHO1 ;NAME(150) =EMIS

*  Solved  variables  list 




SOLVE(P1  ,U1  ,V1  ,W1  ,S140,S142,S145,S147)
SOLVE(S158,TEM1)

*  Stored  variables  list 

STORE(EMIS,RHO1,ENUL,PRPS,DEPO,UCRT,VCRT,WCRT)
STORE(BLOK,S80 )

*  Additional  solver  options 
SOLUTN(P1  ,Y,Y,Y,N,N,Y)
SOLUTN(S140,Y,Y,Y,N,N,Y)
SOLUTN(S142,Y,Y,Y,N,N,Y)
SOLUTN(S145,Y,Y,Y,N,N,Y)
SOLUTN(S147,Y,Y,Y,N,N,Y)
SOLUTN(S158,Y,Y,Y,N,N,Y)
SOLUTN(TEM1,Y,Y,Y,N,N,Y)

IVARBK  =  -1  ; ISOLBK  =  1 


Group  8.  Terms  \&  Devices
DIFCUT = 0.000E+00
NEWRH1 = T
NEWENL = T
UDIFNE = T
USOURC = T
ISOLX = 0 ;ISOLY = 0 ;ISOLZ = 0



Group  9.  Properties 
RHO1 = GRND10
PRESS0 = 2.631E+03
TMP1A = 2.930E+02 ;TMP1B = 0.000E+00 ;TMP1C = 0.000E+00
CP1 = GRND10
ENUL = GRND10 ;ENUT = 0.000E+00
PRNDTL(S140) = -GRND8 ;PRNDTL(S142) = -GRND8
PRNDTL(S145) = -GRND8 ;PRNDTL(S147) = -GRND8
PRNDTL(S158) = -GRND8 ;PRNDTL(TEM1) = -GRND10
TMP1A = 2.930E+02


*  List  of  user-defined  materials  to  be  read  by  EARTH
MATFLG=T;IMAT=1
*  Name
*Ind.  Dens.  Viscos.  Spec.heat  Conduct.  Expans.  Compr.
*  <GAS\_MIXTURE>
70  GRND8  GRND8  GRND8  GRND8  1.000  0.000
*  constants  for  GRND  option  no  1
0  0
*  constants  for  GRND  option  no  2
0  0
*  constants  for  GRND  option  no  3
0  0
*  constants  for  GRND  option  no  4
0  0

Group  10.  Inter-Phase  Transfer  Processes

Group  11.  Initialise  Var/Porosity  Fields
FIINIT(W1  ) = 1.000E+00 ;FIINIT(S140) = 2.185E-02
FIINIT(BLOK) = 1.000E+00 ;FIINIT(TEM1) = 2.980E+02
FIINIT(PRPS) = 7.000E+01

CONPOR(TOP  ,-1.00,CELL  ,-\#1,-\#6,-\#5,-\#5,-\#1,-\#9)
INIT(TOP  ,BLOK, 0.000E+00, 2.000E+00)
INIT(TOP  ,PRPS, 0.000E+00, 1.060E+02)

CONPOR(BOT  ,-1.00,CELL  ,-\#1,-\#6,-\#1,-\#1,-\#1,-\#9)
INIT(BOT  ,BLOK, 0.000E+00, 3.000E+00)
INIT(BOT  ,PRPS, 0.000E+00, 1.060E+02)

CONPOR(SIDE ,-1.00,CELL  ,-\#7,-\#7,-\#1,-\#5,-\#1,-\#9)
INIT(SIDE ,BLOK, 0.000E+00, 4.000E+00)
INIT(SIDE ,PRPS, 0.000E+00, 1.060E+02)

CONPOR(SHF  ,-1.00,CELL  ,-\#1,-\#6,-\#3,-\#3,-\#1,-\#1)
INIT(SHF  ,BLOK, 0.000E+00, 5.000E+00)
INIT(SHF  ,PRPS, 0.000E+00, 1.060E+02)

CONPOR(SHR  ,-1.00,CELL  ,-\#1,-\#6,-\#3,-\#3,-\#9,-\#9)
INIT(SHR  ,BLOK, 0.000E+00, 6.000E+00)
INIT(SHR  ,PRPS, 0.000E+00, 1.060E+02)

CONPOR(SHS  ,-1.00,CELL  ,-\#6,-\#6,-\#3,-\#3,-\#2,-\#8)
INIT(SHS  ,BLOK, 0.000E+00, 7.000E+00)
INIT(SHS  ,PRPS, 0.000E+00, 1.060E+02)

CONPOR(RNGF ,-1.00,CELL  ,-\#1,-\#4,-\#3,-\#3,-\#3,-\#3)
INIT(RNGF ,BLOK, 0.000E+00, 8.000E+00)
INIT(RNGF ,PRPS, 0.000E+00, 1.110E+02)

CONPOR(RNGR ,-1.00,CELL  ,-\#1,-\#4,-\#3,-\#3,-\#7,-\#7)
INIT(RNGR ,BLOK, 0.000E+00, 9.000E+00)
INIT(RNGR ,PRPS, 0.000E+00, 1.110E+02)

CONPOR(RNGS ,-1.00,CELL  ,-\#4,-\#4,-\#3,-\#3,-\#4,-\#6)
INIT(RNGS ,BLOK, 0.000E+00, 1.000E+01)
INIT(RNGS ,PRPS, 0.000E+00, 1.110E+02)

CONPOR(SUSF ,-1.00,CELL  ,-\#1,-\#3,-\#3,-\#3,-\#4,-\#4)
INIT(SUSF ,BLOK, 0.000E+00, 1.100E+01)
INIT(SUSF ,PRPS, 0.000E+00, 1.110E+02)

CONPOR(SUSR ,-1.00,CELL  ,-\#1,-\#3,-\#3,-\#3,-\#6,-\#6)
INIT(SUSR ,BLOK, 0.000E+00, 1.200E+01)
INIT(SUSR ,PRPS, 0.000E+00, 1.110E+02)

CONPOR(SUSS ,-1.00,CELL  ,-\#2,-\#3,-\#3,-\#3,-\#5,-\#5)
INIT(SUSS ,BLOK, 0.000E+00, 1.300E+01)
INIT(SUSS ,PRPS, 0.000E+00, 1.110E+02)

CONPOR(WAF  ,-1.00,CELL  ,-\#1,-\#1,-\#3,-\#3,-\#5,-\#5)
INIT(WAF  ,BLOK, 0.000E+00, 1.400E+01)
INIT(WAF  ,PRPS, 0.000E+00, 1.110E+02)

INIADD  =  F 

Group  12.  Convection  and  diffusion  adjustments 
No  PATCHes  used  for  this  Group 

Group  13.  Boundary  \&  Special  Sources 

INLET  (BFCIN1  ,LOW  , \#1 , \#6 , \#4 , \#4 , \#1 , \#1 , \#1 , \#1) 

VALUE  (BFCIN1  ,P1  ,  GRND1  ) 

VALUE  (BFCIN1  ,U1  ,  GRND1  ) 

VALUE  (BFCIN1  ,V1  ,  GRND1  ) 

VALUE  (BFCIN1  ,W1  ,  GRND1  ) 

VALUE  (BFCIN1  ,S140,  2.185E-02) 

VALUE  (BFCIN1  ,WCRT,  1.400E+00) 

VALUE  (BFCIN1  ,TEM1 ,  2.980E+02) 

INLET  (BFCIN2  ,LOW  , \#1 , \#6 , \#2 , \#2 , \#1 , \#1 , \#1 , \#1) 

VALUE  (BFCIN2  ,P1  ,  GRND1  ) 

VALUE  (BFCIN2  ,U1  ,  GRND1  ) 

VALUE  (BFCIN2  ,V1  ,  GRND1  ) 

VALUE  (BFCIN2  ,W1  ,  GRND1  ) 

VALUE  (BFCIN2  ,WCRT,  4.500E-01) 

VALUE  (BFCIN2  ,TEM1,  2.980E+02) 

PATCH(OU1     ,HIGH  ,\#1,\#6,\#4,\#4,\#9,\#9,\#1,\#1)
COVAL(OU1     ,P1  , FIXVAL    , 0.000E+00)

PATCH(REAR    ,HWALL ,\#1,\#6,\#2,\#2,\#9,\#9,\#1,\#1)
COVAL(REAR    ,U1  , GRND2     , 0.000E+00)
COVAL(REAR    ,V1  , GRND2     , 0.000E+00)

PATCH(SUSFT   ,VOLUME,\#1,\#3,\#3,\#3,\#4,\#4,1,1)
COVAL(SUSFT   ,TEM1, FIXVAL    , 1.023E+03)

PATCH(SUSRT   ,VOLUME,\#1,\#3,\#3,\#3,\#6,\#6,1,1)
COVAL(SUSRT   ,TEM1, FIXVAL    , 1.023E+03)

PATCH(SUSST   ,VOLUME,\#2,\#3,\#3,\#3,\#5,\#5,\#1,\#1)
COVAL(SUSST   ,TEM1, FIXVAL    , 1.023E+03)

PATCH(WAFT    ,VOLUME,\#1,\#1,\#3,\#3,\#5,\#5,\#1,\#1)
COVAL(WAFT    ,TEM1, FIXVAL    , 1.023E+03)

PATCH(TOPT    ,SOUTH ,\#1,\#3,\#5,\#5,\#4,\#6,\#1,\#1)
COVAL(TOPT    ,TEM1, FIXVAL    , 7.230E+02)

PATCH(BOTT    ,NORTH ,\#1,\#3,\#1,\#1,\#4,\#6,\#1,\#1)
COVAL(BOTT    ,TEM1, FIXVAL    , 7.230E+02)

PATCH(SURFWAF ,SOUTH ,1,8,15,15,23,30,\#1,\#1)
COVAL(SURFWAF ,P1  , 1.000E+00 , GRND1     )
COVAL(SURFWAF ,S80 , FIXFLU    , GRND1     )
COVAL(SURFWAF ,S140, FIXFLU    , GRND1     )
COVAL(SURFWAF ,S142, FIXFLU    , GRND1     )
COVAL(SURFWAF ,S145, FIXFLU    , GRND1     )
COVAL(SURFWAF ,S147, FIXFLU    , GRND1     )
COVAL(SURFWAF ,S158, FIXFLU    , GRND1     )
COVAL(SURFWAF ,TEM1, FIXFLU    , GRND1     )

PATCH(RELT    ,PHASEM,1,25,1,27,1,52,1,1)
COVAL(RELT    ,S80 , GRND1     , SAME      )
COVAL(RELT    ,S140, GRND1     , SAME      )
COVAL(RELT    ,S142, GRND1     , SAME      )
COVAL(RELT    ,S145, GRND1     , SAME      )
COVAL(RELT    ,S147, GRND1     , SAME      )
COVAL(RELT    ,S158, GRND1     , SAME      )

PATCH(CHEM    ,VOLUME,1,25,1,27,1,52,1,1)
COVAL(CHEM    ,S80 , GRND1     , GRND1     )
COVAL(CHEM    ,S140, GRND1     , GRND1     )
COVAL(CHEM    ,S142, GRND1     , GRND1     )



COVAL(CHEM    ,S145, GRND1     , GRND1     )
COVAL(CHEM    ,S147, GRND1     , GRND1     )
COVAL(CHEM    ,S158, GRND1     , GRND1     )
COVAL(CHEM    ,TEM1, GRND1     , GRND1     )

PATCH(BUOYANCY,PHASEM,\#1,\#NREGX,\#1,\#NREGY,\#1,\#NREGZ,\#1,\#NREGT)
COVAL(BUOYANCY,U1  , FIXFLU    , GRND3     )
COVAL(BUOYANCY,V1  , FIXFLU    , GRND3     )
COVAL(BUOYANCY,W1  , FIXFLU    , GRND3     )

BUOYA = 0.000E+00 ;BUOYB = -9.810E+00 ;BUOYC = 0.000E+00
BUOYD = GRND10

BFCA  =  2.171E-03 

Group  14.  Downstream  Pressure  For  PARAB 

Group  15.  Terminate  Sweeps
LSWEEP = 500
SELREF = T
RESFAC = 1.000E-03

Group  16.  Terminate  Iterations 

Group  17.  Relaxation 
RELAX(P1  ,LINRLX, 7.000000E-01)
RELAX(U1  ,FALSDT, 2.703210E-02)
RELAX(V1  ,FALSDT, 2.703210E-02)
RELAX(W1  ,FALSDT, 2.703210E-02)
RELAX(S140,FALSDT, 2.703210E+02)
RELAX(S142,FALSDT, 2.703210E+02)
RELAX(S145,FALSDT, 2.703210E+02)
RELAX(S147,FALSDT, 2.703210E+02)
RELAX(S158,FALSDT, 2.703210E+02)
RELAX(TEM1,LINRLX, 3.000000E-01)

Group  18.  Limits 

VARMAX(U1  ) = 1.000000E+03 ;VARMIN(U1  ) =-1.000000E+03
VARMAX(V1  ) = 1.000000E+03 ;VARMIN(V1  ) =-1.000000E+03
VARMAX(W1  ) = 1.000000E+03 ;VARMIN(W1  ) =-1.000000E+03
VARMAX(S80 ) = 1.000000E+00 ;VARMIN(S80 ) = 1.000000E-20
VARMAX(S140) = 1.000000E+00 ;VARMIN(S140) = 1.000000E-20
VARMAX(S142) = 1.000000E+00 ;VARMIN(S142) = 1.000000E-20
VARMAX(S145) = 1.000000E+00 ;VARMIN(S145) = 1.000000E-20
VARMAX(S147) = 1.000000E+00 ;VARMIN(S147) = 1.000000E-20
VARMAX(S158) = 1.000000E+00 ;VARMIN(S158) = 1.000000E-20
VARMAX(TEM1) = 3.000000E+03 ;VARMIN(TEM1) = 2.600000E+02

Group  19.  EARTH  Calls  To  GROUND  Station 
NAMGRD  =CVD 
CSG10 = 'Q1'
SPEDAT(SET,CVD,THMDIF,L,T)
SPEDAT(SET,CVD,THMOPT,I,1)
SPEDAT(SET,CVD,THMFRQ,I,1)
SPEDAT(SET,CVD,THMRLX,R,1.00000E+00)
SPEDAT(SET,CVD,MCDOPT,I,2)
SPEDAT(SET,CVD,BINOPT,I,4)
SPEDAT(SET,CVD,MCPROP,I,3)
SPEDAT(SET,CVD,CHMRLX,R,5.00000E-01)
SPEDAT(SET,CVD,NGREAC,I,5)
SPEDAT(SET,CVD,GREAC(1),I,6)
SPEDAT(SET,CVD,GREAC(2),I,7)
SPEDAT(SET,CVD,GREAC(3),I,9)
SPEDAT(SET,CVD,GREAC(4),I,10)
SPEDAT(SET,CVD,GREAC(5),I,16)
SPEDAT(SET,CVD,NSREAC,I,5)
SPEDAT(SET,CVD,SREAC(1),I,11)
SPEDAT(SET,CVD,SREAC(2),I,12)
SPEDAT(SET,CVD,SREAC(3),I,13)
SPEDAT(SET,CVD,SREAC(4),I,14)
SPEDAT(SET,CVD,SREAC(5),I,15)

Group  20.  Preliminary  Printout 
ECHO  =  T 

Group  21.  Print-out  of  Variables 

Group  22.  Monitor  Print-Out 
IXMON  =  8  ; IYMON  =  16  ;IZMON  =  27 

NPRMNT  =  1 

TSTSWP  =  -1 

Group  23.  Field  Print-Out  \&  Plot  Control
No  PATCHes  used  for  this  Group 

Group  24.  Dumps  For  Restarts 

STOP